Sets of Data
1. Describe data using graphs
2. Describe data using numerical measures
1/6/2015
Key Terms
A class is one of the categories into which
qualitative data can be classified.
The class frequency is the number of
observations in the data set falling into a
particular class.
The class relative frequency is the class
frequency divided by the total number of
observations in the data set.
The class percentage is the class relative
frequency multiplied by 100.
Data Presentation
Qualitative Data: Summary Table, Bar Graph, Pie Chart, Pareto Diagram
Quantitative Data: Dot Plot, Stem-&-Leaf Display, Frequency Distribution, Histogram
Summary Table
1. Lists categories & number of elements in category
2. Obtained by tallying responses in category
3. May show frequencies (counts), % or both
Major        Count
Accounting   130
Economics    20
Management   50
Total        200
Each row is a category. Counts come from a tally of the responses (e.g., |||| ||||).
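The tallying step and the key terms (class frequency, class relative frequency, class percentage) can be sketched in a few lines of Python. This is only an illustration: the `summary_table` helper and the `responses` list are hypothetical, with the raw responses rebuilt from the counts in the majors table.

```python
from collections import Counter

def summary_table(responses):
    """Return {class: (frequency, relative frequency, percentage)}."""
    counts = Counter(responses)          # tally the responses per class
    total = sum(counts.values())         # total number of observations
    return {
        cls: (freq, freq / total, 100 * freq / total)
        for cls, freq in counts.items()
    }

# Hypothetical raw responses rebuilt from the counts in the majors table.
responses = ["Accounting"] * 130 + ["Economics"] * 20 + ["Management"] * 50
table = summary_table(responses)

for major, (freq, rel, pct) in table.items():
    print(f"{major:<12}{freq:>5}{rel:>8.2f}{pct:>8.1f}%")
```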
Bar Graph
[Figure: bar graph of Frequency (axis from 0 to 150) by Major (Acct., Econ., Mgmt.)]
Vertical bars for qualitative variables
Equal bar widths
Bar height shows frequency or %
Vertical axis starts at the zero point
Percent may be used in place of frequency
Pie Chart
1. Shows breakdown of total quantity into categories
2. Useful for showing relative differences
3. Angle size = (360°)(percent), e.g., (360°)(10%) = 36°
[Figure: pie chart of Majors: Acct. 65%, Mgmt. 25%, Econ. 10% (a 36° slice)]
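The angle rule, 360 degrees times the class percentage, is easy to check in code. A minimal sketch, assuming the percentages from the majors pie chart; `pie_angles` is a hypothetical helper name.

```python
def pie_angles(percentages):
    """Map each category to its slice angle in degrees: 360 * (percent / 100)."""
    return {category: 360 * pct / 100 for category, pct in percentages.items()}

# Percentages taken from the majors example.
majors = {"Acct.": 65, "Mgmt.": 25, "Econ.": 10}
angles = pie_angles(majors)

for category, angle in angles.items():
    print(f"{category:<6} {angle:6.1f} degrees")
```

The angles add up to 360 degrees whenever the percentages add up to 100.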
Pareto Diagram
Like a bar graph, but with the categories arranged by
height in descending order from left to right.
[Figure: Pareto diagram of Frequency (axis from 0 to 150) by Major, bars ordered Acct., Mgmt., Econ.]
Vertical bars for qualitative variables
Equal bar widths
Bar height shows frequency or %
Vertical axis starts at the zero point
Percent may be used in place of frequency
Summary
Bar graph: The categories (classes) of the qualitative
variable are represented by bars, where the height of
each bar is either the class frequency, class relative
frequency, or class percentage.
Pie chart: The categories (classes) of the qualitative
variable are represented by slices of a pie (circle). The
size of each slice is proportional to the class relative
frequency.
Pareto diagram: A bar graph with the categories
(classes) of the qualitative variable (i.e., the bars)
arranged by height in descending order from left to
right.
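The only difference between a bar graph and a Pareto diagram is the ordering of the bars, which a short sketch makes concrete. The counts come from the majors example; `pareto_order` and the text-based bars are illustrative assumptions, not a plotting recipe.

```python
def pareto_order(frequencies):
    """Return (category, frequency) pairs sorted by frequency, largest first."""
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

# Class frequencies from the majors example.
counts = {"Acct.": 130, "Econ.": 20, "Mgmt.": 50}
ordered = pareto_order(counts)

# Crude text rendering: one '#' per 10 observations.
for category, freq in ordered:
    print(f"{category:<6} {'#' * (freq // 10)} {freq}")
```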
Thinking Challenge
You're an analyst for IRI. You want to show the
market shares held by Web browsers in 2006.
Construct a bar graph, pie chart, & Pareto diagram
to describe the data.
Browser: Firefox, Internet Explorer, Safari, Others
[Figure: bar graph of market share by Browser]
[Figure: pie chart of market share by Browser; Internet Explorer, 81%]
[Figure: Pareto diagram of market share by Browser]
Dot Plot
1. Horizontal axis is a scale for the quantitative variable,
e.g., percent.
2. The numerical value of each measurement is located
on the horizontal scale by a dot.
Stem-and-Leaf Display
1. Divide each observation into stem value and leaf value
   Stems are listed in order in a column
   Leaf value is placed in corresponding stem row to right of bar
2. Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
   2 | 144677
   3 | 028
   4 | 1
   Example: 26 has stem 2 and leaf 6
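The stem-and-leaf construction can be sketched as follows. This assumes non-negative two-digit integers, so the tens digit is the stem and the units digit is the leaf; `stem_and_leaf` is a hypothetical helper.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Group sorted values by stem (tens digit); leaves are the units digits."""
    rows = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)   # e.g. 26 -> stem 2, leaf 6
        rows[stem].append(leaf)
    return {
        stem: "".join(str(leaf) for leaf in leaves)
        for stem, leaves in sorted(rows.items())
    }

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)

for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```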
Frequency Distribution
Table Steps
1. Determine range
2. Select number of classes
Usually between 5 & 15 inclusive
3. Compute class intervals (width)
4. Determine class boundaries (limits)
5. Compute class midpoints
Class (Boundaries)   Midpoint   Frequency
15.5 - 25.5          20.5       3
25.5 - 35.5          30.5       5
35.5 - 45.5          40.5       2
Width of each class: 25.5 - 15.5 = 10

Percentage Distribution
Class          Prop.   Percentage
15.5 - 25.5    .3      30.0
25.5 - 35.5    .5      50.0
35.5 - 45.5    .2      20.0
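The table steps can be sketched in code. The sketch assumes the underlying data are the ten values from the stem-and-leaf example (they reproduce the proportions .3, .5, .2), that the lower boundary 15.5 and width 10 are chosen by hand, and that each class includes its lower boundary but not its upper one. `frequency_distribution` is a hypothetical helper.

```python
def frequency_distribution(data, lower, width, num_classes):
    """Build class boundaries, midpoints, frequencies, and proportions."""
    rows = []
    for k in range(num_classes):
        lo = lower + k * width
        hi = lo + width
        freq = sum(1 for x in data if lo <= x < hi)   # count values in [lo, hi)
        rows.append({
            "class": (lo, hi),
            "midpoint": (lo + hi) / 2,
            "frequency": freq,
            "proportion": freq / len(data),
        })
    return rows

# Assumed data set: the values from the stem-and-leaf example.
data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
rows = frequency_distribution(data, lower=15.5, width=10, num_classes=3)

for row in rows:
    lo, hi = row["class"]
    print(f"{lo} - {hi}  mid={row['midpoint']}  "
          f"f={row['frequency']}  prop={row['proportion']:.1f}")
```

Half-unit boundaries such as 15.5 mean no whole-number observation can fall exactly on a boundary, so the choice of which end is inclusive does not matter for this data set.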
Histogram
Class          Count
15.5 - 25.5    3
25.5 - 35.5    5
35.5 - 45.5    2
[Figure: histogram of Count (axis from 0 to 5) against the class lower boundaries 15.5, 25.5, 35.5, 45.5, 55.5. The vertical axis may show frequency, relative frequency, or percent. Bars touch.]
Summation Notation
Most formulas we use require a summation of numbers.
$\sum_{i=1}^{n} x_i$
For the data $x_1 = 5$, $x_2 = 3$, $x_3 = 8$, $x_4 = 5$, $x_5 = 4$:
$\sum_{i=1}^{5} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 = 5^2 + 3^2 + 8^2 + 5^2 + 4^2 = 25 + 9 + 64 + 25 + 16 = 139$
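Summation notation maps directly onto a loop or a call to `sum`. The sketch below uses the five data values from the example; the variable names are illustrative. It also shows that the sum of squares is not the same quantity as the square of the sum.

```python
x = [5, 3, 8, 5, 4]

sum_x = sum(x)                              # sum of x_i for i = 1..5
sum_of_squares = sum(xi ** 2 for xi in x)   # sum of x_i^2
square_of_sum = sum_x ** 2                  # (sum of x_i)^2, a different quantity

print(sum_x, sum_of_squares, square_of_sum)
```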
Mean
Median
Mode
Range
Mean
The mean is the Average of a
group of numbers
It is helpful to know the mean
because then you can see which
numbers are above and below the
mean
It is very easy to find!
Mean Example
Here are some example test scores for Ms. Math's class.
82 93 86 97 82
To find the Mean, first you must add up all of the numbers.
82 + 93 + 86 + 97 + 82 = 440
Now, since there are 5 test scores, we will next divide the sum by 5.
440 ÷ 5 = 88
The Mean is 88!
Median
The Median is the middle value
on the list.
The first step is always to put the
numbers in order.
Median Example
Median Example #2
Now, let's try it with an even number of test scores.
92 86 94 83 72 88
94
Mode
The Mode refers to the number that occurs
the most frequently.
It's easy to remember: the first two letters are the same! MOde and MOst
Frequently!
Mode Example
Here is a list of temperatures for one week.
Mon.  Tues.  Wed.  Thurs.  Fri.  Sat.  Sun.
77    79     83    77      83    77    82
Range
The range is the difference between the
highest and the lowest numbers of the
series.
All we have to do is put the numbers in
order and subtract!
Range Example
Let's look at the temperatures again.
77 77 77 79 82 83 83
Range = 83 - 77 = 6
Dad- 34, Mom- 33, Jack- 5, Alex- 5, Katie- 1
Mean
Here are the ages again
Dad- 34, Mom- 33, Jack- 5, Alex- 5, Katie- 1
What is the Mean?
Remember Mean is the AVERAGE
Try it on your paper and see what you come up
with!
Mean
Remember, to find the mean, we have to first add up
all of the numbers.
34 + 33 + 5 + 5 + 1 = 78
Then, since there are 5 people in the family, we next
divide by 5.
78 ÷ 5 = 15.6
The Mean in this case is 15.6
Median
Here are the ages again
Dad- 34, Mom- 33, Jack- 5, Alex- 5, Katie- 1
What is the Median?
Remember Median is the MIDDLE NUMBER
Try it on your paper and see what you come up
with!
Median
Remember, to find the median, we have to first
put all of the numbers in order.
1 5 5 33 34
The middle number is 5, so the Median is 5.
Mode
Here are the ages again
Dad- 34, Mom- 33, Jack- 5, Alex- 5, Katie- 1
What is the Mode?
Remember Mode is the MOST FREQUENT
Try it on your paper and see what you come up with!
Mode
Remember, to find the mode, we have to
first put all of the numbers in order.
1 5 5 33 34
5 occurs most often, so the Mode is 5.
Range
Here are the ages again
Dad- 34, Mom- 33, Jack- 5, Alex- 5, Katie- 1
What is the Range?
Remember Range is the DIFFERENCE
Range
Remember, to find the range, we have to first put
all of the numbers in order.
1 5 5 33 34
34 - 1 = 33, so the Range is 33.
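The four answers for the family ages can be checked with Python's standard `statistics` module. This is a quick verification sketch; the dictionary layout and variable names are illustrative.

```python
from statistics import mean, median, mode

ages = {"Dad": 34, "Mom": 33, "Jack": 5, "Alex": 5, "Katie": 1}
values = sorted(ages.values())   # put the numbers in order first

results = {
    "mean": mean(values),                  # sum divided by the count
    "median": median(values),              # middle value of the ordered list
    "mode": mode(values),                  # most frequent value
    "range": max(values) - min(values),    # highest minus lowest
}

for name, value in results.items():
    print(f"{name:<7}{value}")
```

Note how far the mean (15.6) sits from the median (5): the two adult ages pull the average up.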
Numerical Measures
of Central Tendency
Thinking Challenge
Rs 400,000
Rs 70,000
Rs 50,000
Rs 30,000
Rs 20,000
Two Characteristics
The central tendency of the set of
measurements, that is, the tendency of the data to
cluster, or center, about certain numerical values.
Central Tendency (Location)
Two Characteristics
The variability of the set of measurements, that
is, the spread of the data.
Variation (Dispersion)
Standard Notation
Measure   Sample      Population
Mean      $\bar{x}$   $\mu$
Size      $n$         $N$
Mean
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$
Mean Example
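Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_6}{6} = \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = 8.30$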
Median
1. Measure of central tendency
Median Example
Odd-Sized Sample
Raw Data: 24.1 22.6 21.5 23.7 22.6
Ordered:  21.5 22.6 22.6 23.7 24.1
Position: 1    2    3    4    5
Positioning Point $= \frac{n + 1}{2} = \frac{5 + 1}{2} = 3.0$
Median $= 22.6$
Median Example
Even-Sized Sample
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
Ordered:  4.9 6.3 7.7 8.9 10.3 11.7
Position: 1   2   3   4   5    6
Positioning Point $= \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$
Median $= \frac{7.7 + 8.9}{2} = 8.30$
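Both median examples follow the same rule: compute the positioning point (n + 1) / 2, then either read off that position or average the two neighbouring positions. A minimal sketch, assuming non-empty numeric data; `median_by_position` is a hypothetical helper.

```python
def median_by_position(data):
    """Return (positioning point, median) using the (n + 1) / 2 rule."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) / 2               # 1-based positioning point
    lower = ordered[int(position) - 1]   # value at floor(position)
    if position.is_integer():            # odd n: the middle value itself
        return position, lower
    upper = ordered[int(position)]       # even n: average the two middle values
    return position, (lower + upper) / 2

odd_sample = [24.1, 22.6, 21.5, 23.7, 22.6]
even_sample = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

print(median_by_position(odd_sample))
print(median_by_position(even_sample))
```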
Mode
1. Measure of central tendency
Mode Example
No Mode
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
One Mode
Raw Data: 6.3 4.9 8.9
41 28 43 43
Thinking Challenge
You're a financial analyst for Prudential-Bache Securities. You have
collected the following closing stock prices of new stock issues:
17, 16, 21, 18, 13, 16, 12, 11.
Describe the stock prices in terms of central tendency.
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_8}{8} = \frac{17 + 16 + 21 + 18 + 13 + 16 + 12 + 11}{8} = 15.5$
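Raw Data: 17 16 21 18 13 16 12 11
Ordered:  11 12 13 16 16 17 18 21
Position: 1  2  3  4  5  6  7  8
Positioning Point $= \frac{n + 1}{2} = \frac{8 + 1}{2} = 4.5$
Median $= \frac{16 + 16}{2} = 16$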
17 16 21 18 13 16 12 11
Mode = 16
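The stock-price solution can be reproduced in a few lines. The `modes` helper is a hypothetical function that follows the convention used in the mode examples: it returns an empty list when no value repeats ("no mode") and may return several values when there is a tie. It assumes non-empty data.

```python
from collections import Counter
from statistics import mean, median

def modes(data):
    """Return the most frequent value(s); an empty list if nothing repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:                 # every value occurs once: no mode
        return []
    return sorted(value for value, count in counts.items() if count == top)

prices = [17, 16, 21, 18, 13, 16, 12, 11]

print("mean  ", mean(prices))
print("median", median(prices))
print("mode  ", modes(prices))
print("no-mode case:", modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))
```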
Summary of
Central Tendency Measures
Measure   Formula                   Description
Mean      $\sum x_i / n$            Balance Point
Median    $(n + 1)/2$ Position      Middle Value When Ordered
Mode      none                      Most Frequent
Percentiles
Measures of central tendency that divide a
group of data into 100 parts
At least n% of the data lie below the nth
percentile, and at most (100 - n)% of the
data lie above the nth percentile
Example: 90th percentile indicates that at
least 90% of the data lie below it, and at
most 10% of the data lie above it
The median and the 50th percentile have
the same value.
Applicable for ordinal, interval, and ratio
data
Not applicable for nominal data
Percentiles: Example
Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
Location of 30th percentile: $i = \frac{30}{100}(8) = 2.4$
Quartiles
Measures of central tendency that divide a
group of data into four subgroups
Q1: 25% of the data set is below the first
quartile
Q2: 50% of the data set is below the second
quartile
Q3: 75% of the data set is below the third
quartile
Q1 is equal to the 25th percentile
Q2 is located at 50th percentile and equals
the median
Q3 is equal to the 75th percentile
Quartile values are not necessarily members
of the data set
Quartiles
[Figure: ordered data divided by Q1, Q2, and Q3 into four parts, each holding 25% of the data]
Quartiles: Example
Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
Q1: $i = \frac{25}{100}(8) = 2$, so $Q_1 = \frac{109 + 114}{2} = 111.5$
Q2: $i = \frac{50}{100}(8) = 4$, so $Q_2 = \frac{116 + 121}{2} = 118.5$
Q3: $i = \frac{75}{100}(8) = 6$, so $Q_3 = \frac{122 + 125}{2} = 123.5$
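The location rule used in these examples can be sketched as a small function. When the location i = (P/100)(n) is a whole number, the quartile examples average the values at positions i and i + 1. For a non-whole i the slides show only the location itself, so the branch below (take the value at the next whole position) is an assumption based on a common textbook convention, not something the deck confirms. `percentile` is a hypothetical helper and expects 0 < p < 100.

```python
def percentile(ordered, p):
    """Percentile of an already-ordered list via the location i = (p/100) * n."""
    n = len(ordered)
    i = p * n / 100
    if i == int(i):
        k = int(i)
        # Whole-number location: average positions i and i + 1 (1-based).
        return (ordered[k - 1] + ordered[k]) / 2
    # ASSUMPTION: non-whole location, use the value at the next whole position.
    return ordered[int(i)]

array = [106, 109, 114, 116, 121, 122, 125, 129]
q1, q2, q3 = (percentile(array, p) for p in (25, 50, 75))
print(q1, q2, q3)
```

Other conventions (for example linear interpolation between neighbouring values) give slightly different answers on small data sets, so results from statistical software may not match this rule exactly.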
Measures of Shape
Skewness
Absence of symmetry
Extreme values in one side of a distribution
Kurtosis
Peakedness of a distribution
Leptokurtic: high and thin
Mesokurtic: normal shape
Platykurtic: flat and spread out
Shape
1. Describes how data are distributed
2. Measures of Shape
Skew = Symmetry
Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean
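The mean-versus-median comparison can be turned into a rough check of skew direction. This is only a heuristic sketch: equal mean and median do not prove a distribution is symmetric, and the three data sets are made-up illustrations, not data from the slides. `skew_direction` is a hypothetical helper.

```python
from statistics import mean, median

def skew_direction(data):
    """Classify skew by comparing the mean with the median (rough heuristic)."""
    m, md = mean(data), median(data)
    if m < md:
        return "left-skewed"
    if m > md:
        return "right-skewed"
    return "symmetric"

# Hypothetical data sets chosen to illustrate each case.
print(skew_direction([1, 97, 98, 99, 100]))   # one very small value pulls the mean down
print(skew_direction([1, 2, 3, 4, 5]))
print(skew_direction([1, 2, 3, 4, 100]))      # one very large value pulls the mean up
```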
Skewness
[Figure: three distribution curves: Negatively Skewed, Symmetric (Not Skewed), Positively Skewed]
Kurtosis
Peakedness of a distribution
Leptokurtic: high and thin
Mesokurtic: normal in shape
Platykurtic: flat and spread out
[Figure: three distribution curves: Leptokurtic, Mesokurtic, Platykurtic]
Numerical Measures
of Variability
Range
1. Measure of dispersion
2. Difference between largest & smallest
observations
Range $= x_{largest} - x_{smallest}$
3. Ignores how data are distributed
[Figure: two dot plots on a scale of 7 8 9 10 with differently distributed points; in both, Range = 10 - 7 = 3]
Variance &
Standard Deviation
1. Measures of dispersion
Standard Notation
Measure              Sample      Population
Mean                 $\bar{x}$   $\mu$
Standard Deviation   $s$         $\sigma$
Variance             $s^2$       $\sigma^2$
Size                 $n$         $N$
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$
n - 1 in denominator!
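$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}}$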
Variance Example
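Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ where $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 8.3$
$s^2 = \frac{(10.3 - 8.3)^2 + (4.9 - 8.3)^2 + \dots + (7.7 - 8.3)^2}{6 - 1} = 6.368$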
Thinking Challenge
You're a financial analyst for Prudential-Bache Securities. You have
collected the following closing stock prices of new stock issues:
17, 16, 21, 18, 13, 16, 12, 11.
What are the variance and standard deviation of the stock prices?
Variation Solution*
Sample Variance
Raw Data: 17 16 21 18 13 16 12 11
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ where $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = 15.5$
$s^2 = \frac{(17 - 15.5)^2 + (16 - 15.5)^2 + \dots + (11 - 15.5)^2}{8 - 1} = 11.14$
Variation Solution*
Sample Standard Deviation
$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{11.14} = 3.34$
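The sample variance and standard deviation formulas can be written out directly, which makes the n - 1 denominator explicit. A minimal sketch, assuming at least two observations; the function names are illustrative. The second data set is the six-value sample from the even-sized median example, which reproduces the 6.368 of the variance example.

```python
from math import sqrt

def sample_variance(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def sample_std(data):
    """Sample standard deviation: the square root of the sample variance."""
    return sqrt(sample_variance(data))

prices = [17, 16, 21, 18, 13, 16, 12, 11]
raw = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

print(round(sample_variance(prices), 2), round(sample_std(prices), 2))
print(round(sample_variance(raw), 3))
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which use the same n - 1 denominator.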
Summary of
Variation Measures
Measure                           Formula                                                        Description
Range                             $x_{largest} - x_{smallest}$                                   Total Spread
Standard Deviation (Sample)       $\sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$        Dispersion about Sample Mean
Standard Deviation (Population)   $\sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$                Dispersion about Population Mean
Variance (Sample)                 $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$               Squared Dispersion about Sample Mean