Multidimensional Scaling

Multidimensional scaling
Yee Jean 12524752 Andrea 12524807 Mohan 12524729
What is MDS?
MDS belongs to the more general category of methods for multivariate data analysis
Multidimensional scaling is an exploratory technique used to visualize proximities in a low-dimensional space
Proximities are the relations between pairs of entities (distance or similarity/dissimilarity)
Correlations can be considered similarities, hence the use of a correlation matrix as input
Key Terms
Objects, also called variables or stimuli, are the products, candidates, opinions, or other choices to be compared
Subjects are those doing the comparing
Sometimes the subjects are termed the "source" and the objects are termed the "target"
Subjects may rate themselves, in which case subjects and objects are the same
Dimensions: usually hierarchies, with one or more levels
Key terms
Euclidean distance
The "ordinary" distance between two points on a plane, given by the Pythagorean formula
With this formula as the distance, Euclidean space (or any inner product space) becomes a metric space
The proximities are then represented in a geometrical space, e.g. a Euclidean space
Most commonly used space in MDS
Computed as the square root of the sum of squared coordinate differences
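As a small illustration of the Pythagorean formula, the sketch below computes pairwise Euclidean distances for a few points. The coordinates and the function names `euclidean` and `distance_matrix` are made up for this example and do not come from the slides.

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_matrix(points):
    """Full symmetric matrix of pairwise Euclidean distances."""
    return [[euclidean(p, q) for q in points] for p in points]

# Hypothetical 2-D coordinates for three objects (illustrative only).
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
D = distance_matrix(points)
print(D[0][1])  # 5.0, the 3-4-5 right triangle
```

A matrix like `D` is the kind of distance-type proximity matrix that MDS takes as input.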
You've been asked to fill in this similarity data in two different ways. In the lecture we'll look at the multidimensional scaling of the results
            KitKat   M&M   Snicker   Pocky   Mentos   TicTac
M&M
Snicker
Pocky
Mentos
TicTac
Multidimensional scaling
Part of a family of techniques called Multidimensional Analyses (MDA)
Exploratory data analysis
Shepard (1962) and Kruskal (1964)
Method of ordination
Similarities
Dissimilarities
Goals of MDS
Reduce large amounts of data into easy-to-visualize structures
Find structure (a visual representation) in distance measures
Show how variables/objects are related perceptually
Assign causes to specific locations
How MDS works

An MDS algorithm starts with a matrix of item-item similarities
Assign a location to each item in N-dimensional space, where N is specified a priori
For sufficiently small N, the resulting locations may be displayed in a graph or 3D visualisation
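The idea of assigning each item a location in N-dimensional space can be sketched with plain gradient descent on raw stress. This is a minimal sketch under stated assumptions, not the algorithm used by any particular package: the function names, the step size `lr`, the number of steps and the unit-square dissimilarities are all hypothetical choices made for illustration.

```python
import math
import random

def raw_stress(X, D):
    """Sum of squared gaps between map distances and target dissimilarities."""
    n = len(X)
    return sum((math.dist(X[i], X[j]) - D[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def mds_sketch(D, n_dims=2, steps=1000, lr=0.02, seed=0):
    """Gradient descent on raw stress; returns coordinates and the stress history."""
    rng = random.Random(seed)
    n = len(D)
    # Random starting configuration; n_dims is fixed in advance.
    X = [[rng.uniform(-1.0, 1.0) for _ in range(n_dims)] for _ in range(n)]
    history = [raw_stress(X, D)]
    for _ in range(steps):
        grad = [[0.0] * n_dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = max(math.dist(X[i], X[j]), 1e-12)  # guard against division by zero
                coef = 2.0 * (d - D[i][j]) / d
                for k in range(n_dims):
                    grad[i][k] += coef * (X[i][k] - X[j][k])
        for i in range(n):
            for k in range(n_dims):
                X[i][k] -= lr * grad[i][k]
        history.append(raw_stress(X, D))
    return X, history

# Hypothetical dissimilarities: four items at the corners of a unit square.
r2 = math.sqrt(2.0)
D = [[0.0, 1.0, r2, 1.0],
     [1.0, 0.0, 1.0, r2],
     [r2, 1.0, 0.0, 1.0],
     [1.0, r2, 1.0, 0.0]]
X, history = mds_sketch(D)
print(history[0], history[-1])  # stress before and after the iterations
```

With `n_dims=2` the rows of `X` can be plotted directly as a map. Plain gradient descent can stop in a local minimum, which is one reason production software uses more careful schemes.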
MDS process
Obtain data: type and source
Determine proximities
Transform/scale data
Fit an appropriate model
Find the stress level
If stress is not acceptable: transform the data or change the model
Types of Data
Decompositional/ Attribute-free
Most common type
Rate objects on an overall basis, without reference to objective attributes
Perceptual map
Compositional
Rate objects on each of a variety of specific attributes
Object matrices
May involve specialized procedures
Data collection: How are raw proximities obtained?

Pairwise comparison method
Preference method
Confusion data method
Direct ranking method
Objective methods
Proximity Measures
Types of proximities: similarity/dissimilarity
Shape of the data matrix
The number of ways of a data matrix refers to the dimensionality of the data matrix
The number of modes refers to the number of unique ways underlying the dissimilarities
Symmetry of the proximities is often assumed in the multidimensional scaling of square matrices, but is not always fulfilled
Measurement characteristics of the data
The measurement level relates to the invariance of the proximities under transformations. The usual scales are ratio, interval, ordinal and nominal.
Multidimensional scaling is particularly suited to the analysis of ordinal data; these are the non-metric scaling models.
The measurement process comes down to the distinction between continuous and discrete: objects measured by a discrete process and belonging to the same category have the same number, while objects measured by a continuous process fall in a range of numbers when belonging to the same category.
MDS Models (proximity matrix)

Classical
One proximity matrix (metric or non-metric)
Replicated
Several matrices
Weighted
Aggregate proximities and individual differences in a common MDS space.
Metric or Non-Metric
MDS analyses that assume uniqueness at the interval level (or at stronger levels such as the ratio or absolute level) are known as metric MDS or classical scaling. If weaker levels of uniqueness than the interval level are assumed, so-called non-metric MDS algorithms are used.
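In non-metric MDS only the rank order of the proximities is used. In Kruskal's approach the map distances are replaced by "disparities", a monotone least-squares fit obtained by isotonic regression. The sketch below shows that step with the pool-adjacent-violators algorithm; the function names and the example numbers are hypothetical.

```python
def pava(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the current block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def disparities(dissimilarities, distances):
    """Monotone fit to the distances, taken in the rank order of the dissimilarities."""
    order = sorted(range(len(dissimilarities)), key=lambda i: dissimilarities[i])
    fitted = pava([distances[i] for i in order])
    out = [0.0] * len(distances)
    for rank, i in enumerate(order):
        out[i] = fitted[rank]
    return out

# Hypothetical example: the 2nd and 3rd distances violate the rank order.
print(pava([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]
```

Metric MDS skips this step and compares distances with the (possibly linearly transformed) proximities directly.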
MDS model summary
How MDS works (The iterative MDS-algorithm)
SPSS Case Study

Facial Expressions by Abelson and Sermat (1962)
Description:
Dissimilarities of facial expressions for 13 situations
30 students rated pairs of 13 pictures with facial expressions acted by a woman
9-point scale with respect to overall dissimilarity
Dissimilarity: difference in emotional expression or content
Method
For each subject, 78 proximities resulted
Rescaled over individuals by the method of successive intervals (Diederich et al., 1957)
The means of these intervals were taken as the proximity data
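The figure of 78 proximities per subject is the number of unordered pairs that can be formed from 13 pictures, 13 × 12 / 2. A quick check (variable names are illustrative):

```python
from itertools import combinations
from math import comb

n_pictures = 13
# Every unordered pair of distinct pictures is rated once.
pairs = list(combinations(range(1, n_pictures + 1), 2))
print(len(pairs), comb(n_pictures, 2))  # 78 78
```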
Method - Measurements
The facial expressions are:
1 Grief at death of mother
2 Savoring a coke
3 Very pleasant surprise
4 Maternal love-baby in arms
5 Physical exhaustion
6 Something wrong with plane
7 Anger at seeing dog beaten
8 Pulling hard on seat of chair
9 Unexpectedly meets old boy friend
10 Revulsion
11 Extreme pain
12 Knows plane will crash
13 Light sleep
SPSS options PROXSCAL ALSCAL
PROXSCAL performs multidimensional scaling of proximity data to find a least-squares representation of the objects in a low-dimensional space. Individual differences models are allowed for multiple sources. A majorization algorithm guarantees monotone convergence for optionally transformed metric and nonmetric data under a variety of models and constraints.
ALSCAL: alternating least squares scaling

ALSCAL performs metric or nonmetric Multidimensional Scaling and Unfolding with individual differences options. It can analyze one or more matrices of dissimilarity or similarity data. The analysis represents the rows and columns of the data matrix as points in a Euclidean space. If a row and column are similar, then their points are close together, while if the row and column are dissimilar, they are far apart.
SPSS does not allow you to use proximities directly
Proximity matrix: Input data

Proximities (lower triangle; columns follow the row order: Grief, Savor, Surprise, Love, Exhaustion, Wrong, Anger, Pulling, Meets, Revulsion, Pain, KnowFear, Sleep)
Grief            .
Savor        4.050      .
Surprise     8.250  2.540      .
Love         5.570  2.690  2.110      .
Exhaustion   1.150  2.670  8.980  3.780      .
Wrong        2.970  3.880  9.270  6.050  2.340      .
Anger        4.340  8.530 11.870  9.780  7.120  1.360      .
Pulling      4.900  1.310  2.560  4.210  5.900  5.180  8.470      .
Meets        6.250  1.880   .740   .450  4.770  5.450 10.200  2.630      .
Revulsion    1.550  4.840  9.250  4.920  2.220  4.170  5.440  5.450  7.100      .
Pain         1.680  5.810  7.920  5.420  4.340  4.720  4.310  3.790  6.580  1.980      .
KnowFear     6.570  7.430  8.300  8.930  8.160  4.660  1.570  6.490  9.770  4.930  4.830      .
Sleep        3.930  4.510  8.470  3.480  1.600  4.890  9.180  6.050  6.550  4.120  3.510 12.650      .
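The input is a lower-triangular matrix, so a program that wants the full symmetric matrix has to mirror it. A minimal sketch using only the first four objects from the proximity data (Grief, Savor, Surprise, Love); the helper name `to_symmetric` is hypothetical, and filling the diagonal with 0 is an assumption, since the SPSS table shows it as missing.

```python
labels = ["Grief", "Savor", "Surprise", "Love"]
# Lower-triangle rows copied from the proximity data (first four objects only).
lower = [
    [],
    [4.050],
    [8.250, 2.540],
    [5.570, 2.690, 2.110],
]

def to_symmetric(lower_rows):
    """Mirror a lower triangle into a full square matrix (diagonal assumed 0)."""
    n = len(lower_rows)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(lower_rows):
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = value
    return full

full = to_symmetric(lower)
for name, row in zip(labels, full):
    print(f"{name:<10}", row)
```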
Testing for validity/reliability

Split data tests
Data stability tests
Test-retest reliability
Analysis of results
Stress (phi) is a goodness-of-fit measure for MDS models
The smaller the stress, the better the fit.
High stress may reflect measurement error, but may also reflect having too few dimensions.
There are two versions: Young's S-stress (based on squared distances) and Kruskal's stress (a.k.a. stress formula 1 or stress 1, based on distances).
SPSS generates both but uses S-stress as the stopping criterion: the iterations, in which point coordinates are reset to reduce stress, stop when the improvement in S-stress is .001 or less for that iteration. (The Model dialog lets the researcher adjust this cut-off; if "0" is entered, the algorithm computes 30 iterations.)
Overall stress is the SPSS label for average stress in RMDS models (because RMDS has more than one matrix). Average stress is the square root of the mean of squared Kruskal stress values.
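The two quantities described here can be written out in a few lines. This is a sketch under stated assumptions: `kruskal_stress1` follows the usual textbook form of stress formula 1, the square root of the squared differences between distances and disparities divided by the sum of squared distances, and `overall_stress` follows the "square root of the mean of squared stress values" description. The input numbers are invented, and the function names are not SPSS identifiers.

```python
import math

def kruskal_stress1(distances, disparities):
    """Stress formula 1: sqrt( sum (d - dhat)^2 / sum d^2 )."""
    num = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)

def overall_stress(stress_values):
    """Square root of the mean of squared per-matrix stress values."""
    return math.sqrt(sum(s ** 2 for s in stress_values) / len(stress_values))

# Hypothetical distances and disparities for two pairs of objects.
print(kruskal_stress1([3.0, 4.0], [3.0, 1.0]))
# Hypothetical stress values for two matrices in a replicated (RMDS) analysis.
print(overall_stress([0.3, 0.4]))
```

A perfect fit gives a stress of 0. S-stress applies the same idea to squared distances.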
Stress: badness of fit

Overall stress
Stress and Fit Measures
Normalized Raw Stress                 .02639
Stress-I                              .16246 (a)
Stress-II                             .38154 (a)
S-Stress                              .05219 (b)
Dispersion Accounted For (D.A.F.)     .97361
Tucker's Coefficient of Congruence    .98672
PROXSCAL minimizes Normalized Raw Stress.
a. Optimal scaling factor = 1.027.
b. Optimal scaling factor = 1.030.
Decomposition of stress table Individual stress values

Decomposition of Normalized Raw Stress
Object       SRC_1   Mean
Grief        .0064   .0064
Savor        .0229   .0229
Surprise     .0292   .0292
Love         .0183   .0183
Exhaustion   .0279   .0279
Wrong        .0648   .0648
Anger        .0171   .0171
Pulling      .0077   .0077
Meets        .0187   .0187
Revulsion    .0181   .0181
Pain         .0456   .0456
KnowFear     .0357   .0357
Sleep        .0306   .0306
Mean         .0264   .0264
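The fit measures reported by PROXSCAL are not independent of one another. As a hedged observation on the numbers in the output (not a definition quoted from the slides), Dispersion Accounted For looks like one minus Normalized Raw Stress, Tucker's coefficient like the square root of D.A.F., and Stress-I like the square root of Normalized Raw Stress. The check below uses only values printed in the table; the variable names are illustrative.

```python
import math

# Values as printed in the Stress and Fit Measures output.
normalized_raw_stress = 0.02639
stress_1 = 0.16246
daf = 0.97361
tucker = 0.98672

print(round(1 - normalized_raw_stress, 5))         # compare with D.A.F.
print(round(math.sqrt(daf), 5))                    # compare with Tucker's coefficient
print(round(math.sqrt(normalized_raw_stress), 5))  # compare with Stress-I (up to rounding)
```

Because the printed values are rounded to five decimals, the Stress-I comparison agrees only approximately.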
Common space
Final Coordinates
Object       Dimension 1   Dimension 2
Grief            .223         -.301
Savor           -.371          .113
Surprise        -.854          .485
Love            -.625         -.067
Exhaustion      -.016         -.463
Wrong            .514          .021
Anger            .991          .180
Pulling         -.308          .386
Meets           -.699          .183
Revulsion        .328         -.386
Pain             .271         -.078
KnowFear         .796          .632
Sleep           -.250         -.707
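The final coordinates can be read as a map: pairs with a small input dissimilarity should lie close together, and pairs with a large one far apart. The sketch below checks this for the smallest and largest entries of the proximity data, Love/Meets (.450) and KnowFear/Sleep (12.650), using the coordinates from the table. The helper name `map_distance` is hypothetical.

```python
import math

# Final coordinates (Dimension 1, Dimension 2) from the common space table.
coords = {
    "Grief": (0.223, -0.301), "Savor": (-0.371, 0.113),
    "Surprise": (-0.854, 0.485), "Love": (-0.625, -0.067),
    "Exhaustion": (-0.016, -0.463), "Wrong": (0.514, 0.021),
    "Anger": (0.991, 0.180), "Pulling": (-0.308, 0.386),
    "Meets": (-0.699, 0.183), "Revulsion": (0.328, -0.386),
    "Pain": (0.271, -0.078), "KnowFear": (0.796, 0.632),
    "Sleep": (-0.250, -0.707),
}

def map_distance(a, b):
    """Euclidean distance between two objects in the 2-D common space."""
    return math.dist(coords[a], coords[b])

d_close = map_distance("Love", "Meets")     # input dissimilarity .450
d_far = map_distance("KnowFear", "Sleep")   # input dissimilarity 12.650
print(round(d_close, 3), round(d_far, 3))
```

The map distances are on the scale of the configuration, not of the original 9-point ratings, so only their relative sizes are comparable with the input.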
MDS perceptual map
Shepard Diagram
R2 greater than 0.6
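One common reading of this rule of thumb is that R2 is the squared correlation between the fitted map distances and the (transformed) proximities, the quantity that a Shepard diagram shows graphically. That reading is an assumption here, as are the function name and the numbers in this minimal sketch.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical proximities and the map distances fitted to them.
proximities = [1.0, 2.0, 3.0, 4.0, 5.0]
distances = [0.9, 2.1, 2.9, 4.2, 4.9]
rsq = r_squared(proximities, distances)
print(rsq > 0.6)  # True for these made-up numbers
```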
Plot of transformation
As a practical strategy, we may start with a weaker assumption, but as soon as we find, as a result of the analysis, that a stronger measurement assumption can be justified, we switch to the stronger assumption. In this way we can get more reliable results while avoiding unaffordable scale level assumptions.
Contrasts
Three methods of analysis are closely related to MDS: principal component analysis (PCA), correspondence analysis (CA) and cluster analysis. This section gives a short description of PCA, CA and cluster analysis and their relation to MDS.
6.1. Principal Components Analysis
Principal components analysis (PCA) is performed on a matrix A of n entities observed w.r.t. p variables. The aim is to search for new variables, called principal components, which are linear combinations of the original variables chosen so that they account for most of the variation in the original variables. In metric CMDS a matrix of distances D between the n entities is given, and the aim is to find a low-dimensional configuration of the entities such that the distances are approximated in a least-squares sense. When these distances are Euclidean distances, the coordinates contained in X represent the principal coordinates that would be obtained when doing PCA on A. This approach is called principal coordinates analysis as well as classical scaling. A more detailed account of this correspondence can be found in Everitt and Rabe-Hesketh (1997).
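The link between classical scaling and PCA rests on one algebraic step: double-centering the squared Euclidean distances recovers the inner products of the column-centered data, and an eigendecomposition of that matrix gives the principal coordinates. The sketch below illustrates only the double-centering identity. The data matrix `A` and the helper name are hypothetical, and the eigendecomposition itself is left out.

```python
def double_center(sq):
    """Return B = -1/2 * J * sq * J, with J the centering matrix."""
    n = len(sq)
    row = [sum(r) / n for r in sq]
    col = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row) / n
    return [[-0.5 * (sq[i][j] - row[i] - col[j] + grand) for j in range(n)]
            for i in range(n)]

# Hypothetical data matrix A: four entities observed on two variables.
A = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
n, p = len(A), len(A[0])
sq_dist = [[sum((a - b) ** 2 for a, b in zip(u, v)) for v in A] for u in A]
B = double_center(sq_dist)

# Gram matrix (inner products) of the column-centered data.
means = [sum(r[k] for r in A) / n for k in range(p)]
centered = [[r[k] - means[k] for k in range(p)] for r in A]
gram = [[sum(a * b for a, b in zip(u, v)) for v in centered] for u in centered]

print(B[0])     # first row of the double-centered matrix
print(gram[0])  # matches the first row of the Gram matrix
```

The distances alone are therefore enough to recover the configuration up to rotation and reflection, without access to the original variables.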
Applications of MDS
6.2. Correspondence Analysis
Correspondence analysis is classically used on a two-way contingency table with the aim of visualizing the relations (i.e. deviations from statistical independence) between the row and column categories. The unfolding models do the same: subjects (row categories) and objects (column categories) are visualized so that the order of the distances between a subject point and the object points reflects the preference ranking of the subject. The measure of "proximity" used in CA is the chi-square distance between the profiles. A short description of CA and its relation to MDS can be found in Borg and Groenen (1997).
6.3. Cluster Analysis
Cluster analysis models, or ultrametric tree models, are equally applicable to proximity data, including two-way (asymmetric) square and rectangular data as well as three-way two-mode data. The main difference from the MDS models is that most models for cluster analysis lead to a hierarchical structure. The dissimilarities are approximated by path distances under a number of restrictions. The path distances are sought in a way that minimizes the sum of squared errors:
Take Note
Analysis is not straightforward: many algorithms are built into the SPSS programs (PROXSCAL/ALSCAL), which makes MDS seem easy to compute, but interpretation needs to take into account the processing the data underwent in order to reach a better understanding
More dimensions mean greater complexity of analysis
