Dr David Field
General Information
• The Research Methods course consists of statistics
lectures, workshop exercises, and laboratory practicals
• Bring calculators to workshops, not mobile phones!
• You should have two handouts for this lecture:
• Handout 1
– The schedule for Autumn term Psychological Research Methods
PY1PR1
– Details of Assessment for this module
• Handout 2
– Lecture handout – “Describing data”
General Information
• PowerPoint presentations for this lecture series will be
available to download from my web page
– http://www.personal.rdg.ac.uk/~sxs02dtf/home.html
• and also on BlackBoard
• There is additional information in the “notes” sections at
the bottom of the slides that you won’t see projected on the
screen today
• So, no need to write everything down
• The mode can be used with all data types, and is the only
measure applicable to unordered categories
• The mode is the most frequently occurring score, and may
be illustrated with a pie chart
• In the example data set the variable “birthCountry”
contains 15 instances of “France”, 13 instances of “UK”,
and 2 instances of “Germany”
[Pie chart of birthCountry: UK, France, Germany]
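Finding the mode by counting can be sketched in a few lines of Python. This is only an illustration: the list is rebuilt from the counts quoted on the slide, and the variable names are assumptions, not part of the course data set.

```python
from collections import Counter

# Rebuild the birthCountry variable from the counts given on the slide.
birth_country = ["France"] * 15 + ["UK"] * 13 + ["Germany"] * 2

# Count how often each category occurs.
frequencies = Counter(birth_country)

# The mode is the category with the highest count.
mode, mode_count = frequencies.most_common(1)[0]

print(frequencies)
print("Mode:", mode, "with", mode_count, "cases")
```

Note that `most_common(1)` reports only one category, so it would hide a tie between two equally frequent categories.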
Questions to answer at home
• What is the modal birth country for a sample
containing 20 UK, 23 French, 50 Indian, and 50
Chinese?
– What word describes this sample?
Central tendency for ordinal, interval and
ratio level variables
• Before calculating a measure of central
tendency you should first visually inspect the
variable using a frequency histogram
• Histograms are most informative for large
sample sizes of several hundred cases or
more
– but they are still an essential step for small
samples
• The first step in producing a histogram is to
sort the cases in the variable from lowest to
highest
• The second step is to count the frequency of
occurrence of each value
The 30 IQ values from earlier
• 68 72 77 79 82 90 90 96 97
97 97 100 101 101 101 103 103
104 105 105 109 109 109 114 115
117 118 124 134 140
The IQ score 101
occurs 3 times in
the sample
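The two steps (sort, then count each value) can be tried out on the 30 IQ values with a short Python sketch. The variable names are illustrative, and the text "bars" are only a rough stand-in for a real histogram.

```python
from collections import Counter

# The 30 IQ values from the slide.
iq_scores = [68, 72, 77, 79, 82, 90, 90, 96, 97, 97,
             97, 100, 101, 101, 101, 103, 103, 104, 105, 105,
             109, 109, 109, 114, 115, 117, 118, 124, 134, 140]

# Step 1: sort the cases from lowest to highest.
sorted_scores = sorted(iq_scores)

# Step 2: count the frequency of occurrence of each value.
frequency = Counter(sorted_scores)

# Print one row per distinct value, with one '#' per case.
for value in sorted(frequency):
    print(value, "#" * frequency[value])
```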
Histogram x axis intervals or “bin sizes”
• In the previous example the interval was equal to
one unit on the IQ scale
• Typically, the interval will be wider than a single
unit of the scale
• Be aware of the interval, because a bad interval
choice can make a histogram misleading
– often every score contained in a variable is slightly
different, so a histogram with very small bin sizes will
just look flat
With the same data,
the interval is now
5 IQ points
Note that the y axis
maximum has now
changed
With the same data,
the interval is now
50 IQ points
Note that the y axis
maximum has now
increased
dramatically
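The effect of the interval on the y axis can be checked with a rough Python sketch. It assumes that bins start at multiples of the bin width, which may not match the bin edges used in the plotted histograms, and the helper name `bin_counts` is made up for this example.

```python
from collections import Counter

# The 30 IQ values from earlier.
iq_scores = [68, 72, 77, 79, 82, 90, 90, 96, 97, 97,
             97, 100, 101, 101, 101, 103, 103, 104, 105, 105,
             109, 109, 109, 114, 115, 117, 118, 124, 134, 140]

def bin_counts(scores, width):
    """Count scores per bin; each bin covers start <= score < start + width."""
    # Assumption: bin starts are multiples of the width (e.g. 95, 100, 105).
    return Counter((score // width) * width for score in scores)

# Height of the tallest bar for each interval width.
tallest = {}
for width in (1, 5, 50):
    counts = bin_counts(iq_scores, width)
    tallest[width] = max(counts.values())
    print("interval", width, "-> bins:", len(counts), "tallest bar:", tallest[width])
```

Wider intervals put more cases into each bin, so the tallest bar grows while the number of bars shrinks.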
The mean (commonly “average”)
The median
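• Sort the scores from lowest to highest, then assign ranking
positions in the list and locate the score corresponding to the
middle rank
scores   3    13   10.5   7    6    8    8      12   4
sorted   3    4    6      7    8    8    10.5   12   13
rank     1    2    3      4    5    6    7      8    9
– The middle rank is 5, so the median is 8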
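Both measures can be computed for the nine example scores with a minimal Python sketch. The mean is taken here as the sum of the scores divided by the number of scores. The variable names are illustrative, and the middle-rank shortcut only works as written when the number of scores is odd.

```python
import statistics

# The nine example scores in their original order.
scores = [3, 13, 10.5, 7, 6, 8, 8, 12, 4]

# Mean: add up the scores and divide by how many there are.
mean = sum(scores) / len(scores)

# Median: sort the scores, then take the one at the middle rank.
ordered = sorted(scores)
middle_rank = (len(ordered) + 1) // 2   # ranks start at 1
median = ordered[middle_rank - 1]       # Python lists are indexed from 0

print("mean:", round(mean, 2))
print("median:", median)

# Cross-check against the standard library.
print("library median:", statistics.median(scores))
```

For an even number of scores there is no single middle rank, and `statistics.median` averages the two middle scores instead.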
The mean IQ in this
sample is 101.9
The median IQ is
102
The mean Extroversion
score in this sample is
36.17
The median is 33
When to choose the median
Mean 22.54, Median 23
The range
• The simplest measure of dispersion is obtained
by subtracting the minimum score from the
maximum score
– French sub-sample attitudeEurope has a range of 22
– UK sub-sample attitudeEurope has a range of 31
• Reporting the mean and the range is adequate
as a way of comparing UK and French attitudes
to Europe in this sample
• But the range fails to capture dispersion properly
in some cases, which is why the standard
deviation is normally preferred
– At home, find out the weaknesses of the range as a measure
of dispersion
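The range calculation itself is a one-liner. Because the attitudeEurope scores are not listed in these slides, this sketch uses the 30 IQ values from earlier as stand-in data; the variable names are illustrative.

```python
# The 30 IQ values from earlier, used as stand-in data.
iq_scores = [68, 72, 77, 79, 82, 90, 90, 96, 97, 97,
             97, 100, 101, 101, 101, 103, 103, 104, 105, 105,
             109, 109, 109, 114, 115, 117, 118, 124, 134, 140]

# Range: maximum score minus minimum score.
iq_range = max(iq_scores) - min(iq_scores)

print("range:", iq_range)
```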
The standard deviation
• Subtract the mean (6) from each score to get its deviation:
1 – 6 = -5, 4 – 6 = -2, …, 11 – 6 = 5
• Intuitively, the mean of the deviation scores will be
a measure of the amount of variation in the sample
• But the mean deviation is always zero because the
positive deviations exactly cancel the negative ones
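The cancelling of positive and negative deviations can be seen with the six scores used in the worked example (1, 4, 5, 6, 9, 11). This is a small Python sketch with illustrative variable names.

```python
# The six scores from the worked example.
scores = [1, 4, 5, 6, 9, 11]

# The sample mean.
mean = sum(scores) / len(scores)

# Deviation score = score minus mean.
deviations = [score - mean for score in scores]

print("mean:", mean)
print("deviations:", deviations)
print("sum of deviations:", sum(deviations))
```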
The standard deviation
• The negative signs are removed by squaring the deviation
scores
• $2^2 = 4$, $(-2)^2 = 4$, $3^2 = 9$, $(-3)^2 = 9$, $(-4)^2 = 16$, etc.
• An important statistic called the variance is obtained by
assessing the central tendency in the squared deviation
scores
• Sum the squared deviations
– The squaring process increases the relative contribution of scores
that are far from the mean to the variance, compared to those
scores that are close to the mean
• To calculate the variance you divide the sum of squared
deviations by the number of original scores minus 1
The standard deviation
scores                1    4    5    6    9    11
deviations           -5   -2   -1    0    3    5
squared deviations   25    4    1    0    9    25
• The sum of the squared deviations is 64
• The mean squared deviation (variance) is therefore
– 64 / (6 – 1) = 12.8
• If the units of the scores are kg, what are the units of
the variance?
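The worked example can be reproduced step by step in Python. This is a sketch with illustrative variable names; `statistics.variance` from the standard library is used only as a cross-check, since it also divides by the number of scores minus 1.

```python
import statistics

# The six scores from the worked example.
scores = [1, 4, 5, 6, 9, 11]
mean = sum(scores) / len(scores)

# Square each deviation from the mean to remove the negative signs.
squared_deviations = [(score - mean) ** 2 for score in scores]

# Sum the squared deviations, then divide by N - 1.
sum_of_squares = sum(squared_deviations)
variance = sum_of_squares / (len(scores) - 1)

print("squared deviations:", squared_deviations)
print("sum of squared deviations:", sum_of_squares)
print("variance:", variance)
print("library variance:", statistics.variance(scores))
```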
The standard deviation
• To convert the variance back into units we can
understand intuitively we take the square root of
the variance and call it the standard deviation
– In the worked example the square root of 12.8 is 3.58
• The standard deviation (SD) is in the same units
as the sample mean, so, for example, you can
write that the mean weight of adult domestic cats
in the sample is 5.0 kg (SD 1.0 kg)
• If the population of cat weights is normally
distributed then about 68% of cats will weigh
within one SD of the mean
– about 68% of cats weigh between 4 kg and 6 kg
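Both points on this slide can be checked with a short Python sketch. It assumes Python 3.8 or later for `statistics.NormalDist`, and it treats the cat weights as exactly normal with mean 5.0 kg and SD 1.0 kg, which is an idealisation made for the example.

```python
import math
import statistics

# Step 1: the variance from the worked example (scores 1 4 5 6 9 11).
scores = [1, 4, 5, 6, 9, 11]
variance = statistics.variance(scores)   # divides by N - 1

# Step 2: the standard deviation is the square root of the variance.
sd = math.sqrt(variance)
print("SD:", round(sd, 2))

# Share of a normal population lying within one SD of the mean,
# using the cat example (mean 5.0 kg, SD 1.0 kg).
cat_weights = statistics.NormalDist(mu=5.0, sigma=1.0)
share = cat_weights.cdf(6.0) - cat_weights.cdf(4.0)
print("share between 4 kg and 6 kg:", round(share, 3))
```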
Mean 22.20, SD 6.5
Mean 22.54, SD 8.7
List of questions to answer at home
• Variable
• Level of measurement
– Categorical
– Ordinal
– Continuous
• Interval
• Ratio
• Measures of central tendency
– Mode
– Mean
– Median
• Frequency histogram
– Bin sizes
• Measures of dispersion
– Range
– Variance
– Standard deviation
Variance ($s^2$) formula
The average of the squared differences between each individual score and
the mean for that sample
Formula:
$$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$
– $\sum$ : the sum of...
– $X$ : each score in the sample
– $\bar{X}$ : mean of the sample
– $N - 1$ : number of scores in the sample minus 1
Standard deviation formula
Formula:
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$
Step 1. Calculate the variance
Step 2. Take the square root of the variance