IMS 504-Week 4&5 New

IMS 502
Information Analysis for Decision Making

Descriptive Statistics
Content
• Measure of Location
• Measure of Spread
• Measure of Shape
Learning Outcomes
• After completing this chapter, you should be able to:
– Apply appropriate measures of location, measures of spread and measures of shape
Concept of Descriptive Statistics
Describing Data Numerically:
• Measure of Location (Central Tendency)
• Measure of Spread (Variability)
• Measure of Shape
Statistics
Measure of Location
(Central Tendency)
• Descriptive information about the single numerical value that is
considered to be the most typical of the values of a quantitative variable.
• A measure of central tendency is a measure which indicates where the
middle of the data is.
• Three common measures of central tendency are:
– Mode
– Median
– Mean
Measures of Central
Tendency
• Statistics to represent the ‘centre’ of a
distribution
– Mode (most frequent)
– Median (50th percentile)
– Mean (average)
• Choice of measure dependent on
– Type of data
– Shape of distribution (esp. skewness)
Measures of Central
Tendency
Measure of Central Tendency by Level of Measurement

Level of Measurement | Mode | Median | Mean
Nominal              | X    |        |
Ordinal              | X    | X      | X
Interval             | X    | X      | X
Ratio                | X?   | X      | X
Measures of Central
Tendency
• Choosing a measure of central tendency
Use the Mode when:
1. The variable is measured at the nominal level.
2. You want a quick and easy measure for ordinal and interval-ratio variables.
3. You want to report the most common score.
Use the Median when:
1. The variable is measured at the ordinal level.
2. A variable measured at the interval-ratio level has a highly skewed distribution.
3. You want to report the central score. The median always lies at the exact center of a distribution.
Use the Mean when:
1. The variable is measured at the interval-ratio level (except when the variable is highly skewed).
2. You want to report the typical score. The mean is “the fulcrum that exactly balances all of the scores.”
3. You anticipate additional statistical analysis.
Measures of Central
Tendency
• Mean
– Is the average of the data
• The Population Mean:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$$
which is usually unknown, so we use the sample mean to estimate or approximate it.
• The Sample Mean:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Measures of Central
Tendency
Example:
Here is a random sample of size 10 of ages, where
$x_1 = 42$, $x_2 = 28$, $x_3 = 28$, $x_4 = 61$, $x_5 = 31$,
$x_6 = 23$, $x_7 = 50$, $x_8 = 34$, $x_9 = 32$, $x_{10} = 37$.
$\bar{x} = (42 + 28 + \dots + 37) / 10 = 36.6$
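The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and `statistics.mean` from the standard library is used as a cross-check on the manual formula.

```python
from statistics import mean

# Ages of the ten sampled people, as listed in the example.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# Sample mean: sum of the observations divided by the sample size n.
manual_mean = sum(ages) / len(ages)

# The standard library should agree with the manual calculation.
library_mean = mean(ages)

print(manual_mean, library_mean)
```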

Measures of Central
Tendency
• Properties of the Mean
– Uniqueness: For a given set of data there is one and only one mean.
– Simplicity: It is easy to understand and to compute.
– Affected by extreme values: All values enter into the computation.
Measures of Central
Tendency
Example:
Assume the values are 115, 110, 119, 117, 121 and 126.
The mean = 118.
But assume that the values are 75, 75, 80, 80 and 280.
The mean = 118, a value that is not representative of the set of data as
a whole.
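A short Python sketch can make the effect of the extreme value visible. It reuses the two data sets from the example; the third list, which drops the value 280, is an illustrative addition and not part of the original example.

```python
from statistics import mean

# First data set: values cluster around the mean.
clustered = [115, 110, 119, 117, 121, 126]

# Second data set: one extreme value (280) among small values.
with_extreme = [75, 75, 80, 80, 280]

# Illustrative assumption: the same data with the extreme value removed.
without_extreme = [v for v in with_extreme if v != 280]

mean_clustered = mean(clustered)
mean_with_extreme = mean(with_extreme)
mean_without_extreme = mean(without_extreme)

print(mean_clustered, mean_with_extreme, mean_without_extreme)
```

Both original data sets share the same mean, yet removing the single extreme value moves the second mean close to where most of its values lie.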
Statistics
• Median
– When the data are ordered, it is the observation that
divides the set of observations into two equal parts
such that half of the data are before it and the other
half are after it.
– The median is the center point in a set of numbers
(50% above, 50% below)
Statistics
• Median
– Check to see which of the following two
rules applies:
• Rule One
If n is odd, the median will be the middle observation.
It will be the (n+1)/2 th ordered observation.
When n = 11, then the median is the 6th observation.
Example:
Three is the median for the numbers 1, 1, 3, 4, 9
Statistics
• Median
– Check to see which of the following two
rules applies:
• Rule Two
If n is even, there are two middle observations. The median
will be the mean of these two middle observations.
It will be the (n+1)/2 th ordered observation.
When n = 12, then the median is the 6.5th observation,
which is an observation halfway between the 6th and 7th
ordered observation.
Example:
23, 28, 28, 31, 32, 34, 37, 42, 50, 61.
Since n = 10, the median is the 5.5th observation, i.e.
(32 + 34)/2 = 33
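Rule One and Rule Two can be written as one small Python function. This is a minimal sketch; the function name `median_by_rule` is hypothetical and the standard library's `statistics.median` would normally be used instead.

```python
def median_by_rule(values):
    """Return the median using the odd/even rules from the slides."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Rule One: n is odd, take the (n+1)/2-th ordered observation.
        # With 0-based indexing that is position n // 2.
        return ordered[middle]
    # Rule Two: n is even, average the two middle observations,
    # i.e. the n/2-th and (n/2 + 1)-th ordered observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_example = median_by_rule([1, 1, 3, 4, 9])
even_example = median_by_rule([23, 28, 28, 31, 32, 34, 37, 42, 50, 61])
print(odd_example, even_example)
```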
Measures of Central
Tendency
• Properties of the Median
– Uniqueness: For a given set of data there is one and only one median.
– Simplicity: It is easy to calculate.
– It is not affected by extreme values as is the mean.
Statistics
• Mode
– The mode is simply the most frequently
occurring number.
– If all values are different there is no mode.
– Sometimes there is more than one mode.
Measures of Central
Tendency
Example:
Assume the values are 23, 28, 28, 31, 32, 34, 37, 42, 50, 61.
The mode = 28 (repeated two times).
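The most frequent value in this data set is 28, which is its mode. The sketch below finds modes by counting frequencies. The helper name `find_modes` is hypothetical, and it follows the convention stated in the slides that a data set with all values different has no mode.

```python
from collections import Counter


def find_modes(values):
    """Return a list of modes; an empty list means there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs once: by the slides' convention, no mode.
        return []
    # More than one value can share the highest frequency.
    return [value for value, count in counts.items() if count == highest]


ages = [23, 28, 28, 31, 32, 34, 37, 42, 50, 61]
print(find_modes(ages))             # one mode
print(find_modes([1, 2, 3]))        # no mode
print(find_modes([1, 1, 2, 2, 3]))  # two modes
```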

Measures of Central
Tendency
• Properties of the Mode
– Sometimes, it is not unique.
– It may be used for describing qualitative
data.
Measures of Central
Tendency
• The only procedure in SPSS that will produce all three
commonly used measures of location is Frequencies.
• To begin:
Working Example (Pg. 59)
• One hundred tennis players participated
in a serving competition. Gender and
number of aces were recorded for each
player. The data can be found in
Work4.sav on the iLearn web site that
accompanies this title.
• Follow steps 1, 2, 3, 4, 6, 8 & 11.
Exercises
• Use the Frequencies command to get all
three (3) measures of location.
• Write a sentence or two reporting each
measure. You may choose three (3)
variables to report on.
Statistics
Measure of Spread
(Variability/Dispersion)
• Give information on the spread or variability of the data values.
• Measures of deviation from the central tendency.
• Non-parametric/non-normal:
• range, percentiles, min, max
• Parametric:
• standard deviation (SD) & properties of the normal distribution
Statistics
Measure of Spread
(Variability/Dispersion)
• Four common measures of spread are:
– Range
– Quartiles
– Variance
– Standard Deviation
Measures of Spread
Measure of Spread by Level of Measurement

Level of Measurement | Range, Min/Max | Percentile | Standard Deviation (SD)
Nominal              |                |            |
Ordinal              | X              |            |
Interval             | X              | X          | X?
Ratio                | X              | X          | X
Measures of Spread
• Range (R)
– The range is the difference between the
highest and lowest scores in a distribution.
– For example, if the highest mark among 100
students was 85 and the lowest was 35,
then the range is 50 (85 – 35 = 50).
– Although the range is limited as a means of
describing the general spread of a group of data,
it does set the boundaries of the scores.
Measures of Spread
• Quartiles (Q)
– The quartiles split the ordered data into four
quarters:
• Q1, or 25th percentile: 25% of the
observations lie at or below it
• Q2, the median (50th percentile): 50% of the
observations lie at or below it
• Q3, or 75th percentile: 75% of the
observations lie at or below it
– The difference Q3 – Q1 is called the
inter-quartile range, or IQR.

  25%  |  25%  |  25%  |  25%
       Q1      Q2      Q3
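The range, quartiles and IQR can be computed with the Python standard library. The sketch below reuses the ten ages from the central tendency examples, which is an assumption made for illustration. `statistics.quantiles` (Python 3.8+) is called with its default "exclusive" method; other software or other conventions may return slightly different quartile values for the same data.

```python
from statistics import quantiles

# Illustrative data: the ten ages used in the earlier examples.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# Range: highest score minus lowest score.
data_range = max(ages) - min(ages)

# Quartiles: n=4 asks for the three cut points Q1, Q2 and Q3.
q1, q2, q3 = quantiles(ages, n=4)

# Inter-quartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print("range:", data_range)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
```

Q2 equals the median of the same data (33), which is a quick way to sanity-check the output.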
Measures of Spread
• Variance
– The variance is the square of the standard
deviation.
– The lower the variance, the more accurately
the mean represents the scores of all cases
in a distribution of data.
Measures of Spread
• Standard Deviation
– The standard deviation provides the
researcher with an indicator of how scores
for variables are spread around the mean.
– The higher the standard deviation, the more
spread out the scores are around the mean.
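As a sketch of how variance and standard deviation behave, the code below compares the two data sets from the earlier mean example, which share a mean of 118 but differ greatly in spread. It assumes the data are samples, so `statistics.variance` and `statistics.stdev` divide by n - 1; the variable names are illustrative.

```python
from statistics import stdev, variance

# Same mean (118), very different spread.
clustered = [115, 110, 119, 117, 121, 126]
with_extreme = [75, 75, 80, 80, 280]

# Sample variance: sum of squared deviations from the mean / (n - 1).
var_clustered = variance(clustered)
var_with_extreme = variance(with_extreme)

# Sample standard deviation: square root of the sample variance.
sd_clustered = stdev(clustered)
sd_with_extreme = stdev(with_extreme)

print("clustered:    variance =", var_clustered, " SD =", sd_clustered)
print("with_extreme: variance =", var_with_extreme, " SD =", sd_with_extreme)
```

The much larger standard deviation of the second data set reflects that its mean is a poor summary of the individual scores.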
Measures of Spread
• Using SPSS for measure of spread.
• To begin:
• Follow steps 1, 2, 3, 4, 5, 7, 8 & 11.
Exercises
• Choose three (3) variables to work on.
• Write a few sentences summarizing these
tables for each under measurement of
spread.
• Describe the difference (if any).
Statistics
Measure of Shape
• Describes how the data are distributed.
• Use the normal curve (combination of mean & standard deviation) to construct
precise descriptive statements.
• Two common measures of shape are:
– Kurtosis is a measure of whether the data are peaked or flat relative to a
normal distribution (platykurtic, normal or leptokurtic).
– Skewness is a measure of symmetry, or more precisely, the lack of symmetry
(positive skewness or negative skewness).
Measure of Shape - Skewness

Negative Skewness      | Symmetric            | Positive Skewness
Mean < Median < Mode   | Mean = Median = Mode | Mode < Median < Mean
Coefficient = Negative | Coefficient = 0      | Coefficient = Positive
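The sign of the skewness coefficient can be illustrated with a short Python sketch. It uses the plain moment-based coefficient (third central moment divided by the 1.5 power of the second central moment); this choice is an assumption, and statistical packages often report a sample-size-adjusted variant, so their values may differ slightly. The function name `skewness` and the data sets are illustrative.

```python
def skewness(values):
    """Moment-based skewness coefficient: m3 / m2 ** 1.5."""
    n = len(values)
    if n == 0:
        raise ValueError("skewness of an empty data set is undefined")
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - center) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]  # mode < median < mean
left_skewed = [-x for x in right_skewed]                 # mirror image

print(skewness(symmetric))     # zero for symmetric data
print(skewness(right_skewed))  # positive coefficient
print(skewness(left_skewed))   # negative coefficient
```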
Measure of Shape - Kurtosis
Leptokurtic (Kurtosis, γ2 > 0):
• Data distribution with small
standard deviation.
• Data sets with high kurtosis tend
to have a distinct peak near the
mean, decline rather rapidly, and
have heavy tails.
• The data will cluster around or
close to the Mean.
Platykurtic (Kurtosis, γ2 < 0):
• Data distribution with large
standard deviation.
• Data sets with low kurtosis tend
to have a flat top near the mean
rather than a sharp peak.
• The data will be far away from
the mean.
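The sign convention for γ2 can be sketched in Python using the moment-based excess kurtosis (fourth central moment divided by the squared second central moment, minus 3, so that a normal distribution scores 0). This formula is an assumption for illustration; statistical packages often apply a sample-size adjustment and may report slightly different values. The function name and the two small data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3."""
    n = len(values)
    if n == 0:
        raise ValueError("kurtosis of an empty data set is undefined")
    center = sum(values) / n
    m2 = sum((x - center) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - center) ** 4 for x in values) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


# Evenly spread values with no distinct peak: flat top.
flat = [1, 2, 3, 4, 5]

# Most values at the center with two far-out values: sharp peak, heavy tails.
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

print(excess_kurtosis(flat))    # negative: platykurtic
print(excess_kurtosis(peaked))  # positive: leptokurtic
```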
Measure of Shape
• Using SPSS for measure of shape.
• To begin:
• Follow steps 1, 2, 3, 4, Click on Skewness
& Kurtosis, 8, 9, 10 & 11.
Exercise
• Choose three (3) variables to work on.
• Write a few sentences summarizing these
tables for each under measurement of
shape.
• Describe the difference (if any).
