
MGT 205 - Introduction to Statistics
Lecture 2
Methods for Describing Sets of Data

İstanbul Şehir University


Gökçen Arkalı Olcay
Fall 2016-2017

Fall 2016-2017 MGT 205 1


Learning Objectives
1. Describing qualitative data
2. Graphical methods for describing quantitative data
3. Numerical measures of central tendency
4. Numerical measures of variability
5. Using the mean and standard deviation to describe data
6. Numerical measures of relative standing
7. Methods for detecting outliers: box plots and z-scores
8. Graphing bivariate relationships
9. The time series plot
10. Distorting the truth with descriptive techniques



Learning Objectives
1. Describe data using tabular methods and graphs
   • Describing Qualitative Data
   • Describing Quantitative Data
2. Describe data using numerical measures
   • Summation Notation
   • Numerical Measures of Central Tendency
   • Numerical Measures of Variability
   • Interpreting the Standard Deviation
   • Numerical Measures of Relative Standing



Describing Qualitative Data



Key Terms
A class is one of the categories into which qualitative
data can be classified.
The class frequency is the number of observations in
the data set falling into a particular class.
The class relative frequency is the class frequency
divided by the total number of observations in the
data set.
The class percentage (percent frequency) is the class
relative frequency multiplied by 100.



Frequency Distribution

A frequency distribution is a tabular summary
of data showing the number (frequency) of
items in each of several non-overlapping
classes.



Data on 40 Best-Paid Executives
(Forbes, 2011)
CEO                Company               Salary ($ millions)   Age   Degree
Lauren, Ralph      Polo Ralph Lauren      43                   71    None
Schultz, Howard    Starbucks              29.73                57    Bachelor’s
Chambers, John     Cisco Systems          37.90                61    MBA
Larsen, Marshall   Goodrich               22.43                62    Master’s
Buckley, George    3M                     23.22                64    PhD
Solomon, Howard    Forest Labs            27.10                83    Law
Hemsley, Stephen   United Health Group   101.96                58    Bachelor’s
...
Data on 40 Best-Paid Executives
(Forbes, 2011)
Degree column: None, Bachelors, Bachelors, None, MBA, PhD, MBA, Bachelors, Masters, Law, PhD, ...

Frequency Distribution:
Degree      Frequency
None         4
Bachelors   13
Masters      5
MBA         15
Law          1
PhD          2
TOTAL       40



Summary Table (Frequency
Distribution)
1. Lists categories & number of elements in category
2. Obtained by tallying responses in category
3. May show frequencies (counts), % or both

Major (row is category)   Count (from tally)
Accounting                130
Economics                  20
Management                 50
Total                     200
Relative Frequency and Percent
Frequency Distribution
Relative Frequency:
$$\text{Relative frequency of a class} = \frac{\text{Frequency of the class}}{n}$$
n: sample size

Percent Frequency:
$$\text{Percent frequency} = \text{Relative frequency} \times 100$$



Data on 40 Best-Paid Executives
(Forbes, 2011)
Degree      Frequency   Relative Frequency   Percent Frequency (Class Percentage)
None         4          0.10                  10
Bachelors   13          0.325                 32.5
Masters      5          0.125                 12.5
MBA         15          0.375                 37.5
Law          1          0.025                  2.5
PhD          2          0.05                   5
TOTAL       40          1.00                 100
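
The relative and percent frequencies in this table can be reproduced directly from the class frequencies. The following is a minimal Python sketch, not part of the original slides; the names degree_counts and frequency_table are illustrative choices.

```python
# Class frequencies copied from the slide's table (40 executives in total).
degree_counts = {
    "None": 4, "Bachelors": 13, "Masters": 5, "MBA": 15, "Law": 1, "PhD": 2,
}

def frequency_table(counts):
    """Map each class to (frequency, relative frequency, percent frequency)."""
    n = sum(counts.values())  # sample size
    return {label: (f, f / n, 100 * f / n) for label, f in counts.items()}

table = frequency_table(degree_counts)
for label, (f, rel, pct) in table.items():
    print(f"{label:<10}{f:>4}{rel:>8.3f}{pct:>7.1f}")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick check on any hand-built table.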



Graphical Methods for Describing
Qualitative Data

• Bar graph
• Pie chart
• Pareto diagram



Bar Graph
• Equal bar widths
• Bar height shows frequency; percent frequency (%) is also used
• Vertical bars for qualitative variables
• Zero point shown on the frequency axis



Pie Chart
1. Shows breakdown of total quantity into categories
2. Useful for showing relative differences
3. Angle size
• (360°)(percent)
Example (Majors): Acct. 65%, Mgmt. 25%, Econ. 10%
Econ. slice: (360°)(10%) = 36°
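
The angle rule can be checked for every slice of the majors example. This is a small Python sketch added for illustration; it assumes the counts from the majors summary table (Accounting 130, Economics 20, Management 50), and the function name slice_angles is hypothetical.

```python
# Counts taken from the majors summary table (total 200 students).
major_counts = {"Accounting": 130, "Economics": 20, "Management": 50}

def slice_angles(counts):
    """Angle of each pie slice in degrees: 360 times the class relative frequency."""
    total = sum(counts.values())
    return {label: 360 * f / total for label, f in counts.items()}

angles = slice_angles(major_counts)
print(angles)  # {'Accounting': 234.0, 'Economics': 36.0, 'Management': 90.0}
```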
Pareto Diagram
Like a bar graph, but with the categories arranged by
height in descending order from left to right.
[Figure: Pareto diagram; vertical axis: frequency (percent is also used)]



Data on 40 Best-Paid Executives
(Forbes, 2011)
[Figure: bar graph of degree frequency (categories: MBA, Bachelor's, Master's, None, PhD; frequency axis from 0 to 16) and pie chart of degree by percent frequency]

40 Best-Paid Executives (Forbes,
2011)
[Figure: Pareto diagram of degree frequency; bars in descending order: MBA, Bachelor's, Master's, None, PhD, Law; frequency axis from 0 to 16]



Summary
Bar graph: The categories (classes) of the qualitative
variable are represented by bars, where the height of
each bar is either the class frequency, class relative
frequency, or class percentage.
Pie chart: The categories (classes) of the qualitative
variable are represented by slices of a pie (circle). The
size of each slice is proportional to the class relative
frequency.
Pareto diagram: A bar graph with the categories
(classes) of the qualitative variable (i.e., the bars)
arranged by height in descending order from left to
right.
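
The only difference between a bar graph and a Pareto diagram is the ordering of the bars. The sketch below, added for illustration, sorts the degree frequencies from the executives example in descending order and prints a rough text version of the diagram; the variable names are illustrative.

```python
# Degree frequencies from the 40 best-paid executives example.
degree_counts = {
    "None": 4, "Bachelors": 13, "Masters": 5, "MBA": 15, "Law": 1, "PhD": 2,
}

# Pareto ordering: categories sorted by frequency, largest first.
pareto_order = sorted(degree_counts.items(), key=lambda item: item[1], reverse=True)

for label, f in pareto_order:
    print(f"{label:<10}{'#' * f} {f}")  # one '#' per observation
```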
Example 1- Describing Qualitative
Data
A partial relative frequency distribution is given.
Class Relative Frequency
A 0.22
B 0.18
C 0.4
D
a. What is the relative frequency of class D?
b. The total sample size is 200. What is the frequency of class
D?
c. Show the frequency distribution.
d. Show the percent frequency distribution.
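
One way to check answers to this exercise is sketched below in Python. It is an added illustration, not the instructor's solution; it relies only on the facts that relative frequencies sum to 1 and that frequency equals relative frequency times n. Rounding is used to avoid floating-point noise.

```python
known = {"A": 0.22, "B": 0.18, "C": 0.40}  # relative frequencies given in the exercise
n = 200                                     # total sample size from part (b)

# (a) Relative frequencies must sum to 1, so class D takes the remainder.
relative = dict(known)
relative["D"] = round(1 - sum(known.values()), 2)

# (b), (c) Frequency = relative frequency * n.
frequency = {label: round(rel * n) for label, rel in relative.items()}

# (d) Percent frequency = relative frequency * 100.
percent = {label: round(rel * 100, 1) for label, rel in relative.items()}

print(relative["D"], frequency, percent)
```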



Tabular and Graphical Methods for
Describing Quantitative Data



Summarizing Quantitative Data
1. Determine number of classes
– General guideline: use between 5 and 20 classes
2. Compute class intervals (width)
3. Determine class boundaries (limits)
– Choose the limits so that each item belongs to one and only one
class
4. Count observations & assign to classes

$$\text{Approx. class width} = \frac{\text{Largest data value} - \text{Smallest data value}}{\text{Number of classes}}$$



Percentages of Revenues Spent on
Research and Development (R&D)
Data:
13.5   9.5   8.2   6.5
 8.4   8.1   6.9   7.5
10.5  13.5   7.2   7.1
 9.0   9.9   8.2  13.2
 9.2   6.9   9.6   7.7
 9.7   7.5   7.2   5.9
 6.6  11.1   8.8   5.2
10.6   8.2  11.3   5.6
10.1   8.0   8.5  11.7
 7.1   7.7   9.4   6.0
 8.0   7.4  10.5   7.8
 7.9   6.5   6.9   6.5
 6.8   9.5

1) Number of classes = 9
2) Approximate class width = (13.5 - 5.2)/9 = 0.92 => round up to 1
3) Class limits: (5-6), (6-7), (7-8), (8-9), (9-10), (10-11), (11-12), (12-13), (13-14)

Frequency Distribution:
Class Interval   Frequency
5.0-6.0           3
6.0-7.0           9
7.0-8.0          11
8.0-9.0           9
9.0-10.0          8
10.0-11.0         4
11.0-12.0         3
12.0-13.0         0
13.0-14.0         3
TOTAL            50
Relative Frequency and Percent
Frequency Distributions for R&D Data
Class Interval   Frequency   Relative Frequency   Percent Frequency
5.0-6.0           3          0.06                   6
6.0-7.0           9          0.18                  18
7.0-8.0          11          0.22                  22
8.0-9.0           9          0.18                  18
9.0-10.0          8          0.16                  16
10.0-11.0         4          0.08                   8
11.0-12.0         3          0.06                   6
12.0-13.0         0          0.00                   0
13.0-14.0         3          0.06                   6
TOTAL            50          1.00                 100
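
The class counts can be reproduced from the 50 raw R&D percentages. The Python sketch below is an added illustration. It assumes that a value falling exactly on a class limit (such as 6.0 or 8.0) is counted in the upper class; the slides do not state this, but it is the convention that matches the frequencies shown.

```python
import math

rd_percentages = [
    13.5, 9.5, 8.2, 6.5, 8.4, 8.1, 6.9, 7.5, 10.5, 13.5,
    7.2, 7.1, 9.0, 9.9, 8.2, 13.2, 9.2, 6.9, 9.6, 7.7,
    9.7, 7.5, 7.2, 5.9, 6.6, 11.1, 8.8, 5.2, 10.6, 8.2,
    11.3, 5.6, 10.1, 8.0, 8.5, 11.7, 7.1, 7.7, 9.4, 6.0,
    8.0, 7.4, 10.5, 7.8, 7.9, 6.5, 6.9, 6.5, 6.8, 9.5,
]

num_classes = 9
# Steps 1 and 2 of the procedure: approximate width, then round up to a convenient value.
approx_width = (max(rd_percentages) - min(rd_percentages)) / num_classes
lowest_limit, width = 5, 1

# Step 4: count observations per class; each class is [lower, upper).
counts = [0] * num_classes
for x in rd_percentages:
    counts[math.floor((x - lowest_limit) / width)] += 1

n = len(rd_percentages)
for k, f in enumerate(counts):
    lower = lowest_limit + k * width
    print(f"{lower:.1f}-{lower + width:.1f}  {f:>2}  {f / n:.2f}  {100 * f / n:.0f}")
```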



Histogram for R&D Data
[Figure: histogram of the R&D frequency distribution; horizontal axis: R&D percentage (classes 5.0-6.0 through 13.0-14.0), vertical axis: frequency]
Bars touch.
Dot Plot
1. Horizontal axis is a scale for the quantitative variable,
e.g., percent.
2. The numerical value of each measurement is located
on the horizontal scale by a dot.



Stem-and-Leaf Display
Data: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Divide each observation into stem value and leaf value
• Stems are listed in order in a column
• Leaf value is placed in corresponding stem row to right of bar

2 | 144677
3 | 028
4 | 1

Example: 26 has stem 2 and leaf 6.
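
The display can be generated mechanically. The following Python sketch is an added illustration that assumes non-negative two-digit integers, so the tens digit is the stem and the units digit is the leaf; the function name stem_and_leaf is hypothetical. Stems with no observations are simply omitted in this version.

```python
from collections import defaultdict

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); the units digits are the leaves."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in sorted(rows.items())}

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```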
Summation Notation



Summation Notation
Most formulas we use require a summation of numbers.

$$\sum_{i=1}^{n} x_i$$

Sum the measurements on the variable that appears to the
right of the summation symbol, beginning with the 1st
measurement and ending with the nth measurement.



Summation Notation
For the data $x_1 = 5$, $x_2 = 3$, $x_3 = 8$, $x_4 = 5$, $x_5 = 4$:

$$\sum_{i=1}^{5} x_i^2 = ?$$

$$\sum_{i=1}^{5} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2$$
$$= 5^2 + 3^2 + 8^2 + 5^2 + 4^2$$
$$= 25 + 9 + 64 + 25 + 16 = 139$$
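
The same sum can be written in one line of Python. This sketch is an added illustration; it also computes the square of the sum for contrast, since the sum of squares and the square of the sum are generally different quantities.

```python
x = [5, 3, 8, 5, 4]

sum_x = sum(x)                          # x_1 + ... + x_5 = 25
sum_of_squares = sum(xi ** 2 for xi in x)  # x_1^2 + ... + x_5^2 = 139
square_of_sum = sum_x ** 2              # (x_1 + ... + x_5)^2 = 625

print(sum_x, sum_of_squares, square_of_sum)
```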
Numerical Measures
of Central Tendency



Thinking Challenge

... employees cite low pay: most workers earn only $20,000.
... President claims average pay is $70,000!
[Figure: salary scale marked at $400,000, $70,000, $50,000, $30,000 and $20,000]



Two Characteristics
Numerical methods measure two characteristics
of data:
1) The central tendency of the set of
measurements–that is, the tendency of the data to
cluster, or center, about certain numerical values.

Central Tendency
(Location)



Two Characteristics
2) The variability of the set of measurements–
that is, the spread of the data.

Variation
(Dispersion)

Mean
1. Most common measure of central tendency
2. Acts as ‘balance point’
3. Affected by extreme values (‘outliers’)
4. Denoted $\bar{x}$, where

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$$



Standard Notation

Measure   Sample      Population
Mean      $\bar{x}$   $\mu$
Size      n           N



Mean Example
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6}{6}$$
$$= \frac{10.3 + 4.9 + 8.9 + 11.7 + 6.3 + 7.7}{6} = 8.30$$
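
The mean example can be verified with a few lines of Python. This is an added sketch (the function name sample_mean is illustrative). It also checks the "balance point" property: the deviations from the mean sum to zero, up to floating-point rounding.

```python
data = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

def sample_mean(values):
    """Sample mean: the sum of the measurements divided by n."""
    return sum(values) / len(values)

x_bar = sample_mean(data)
# Balance point: positive and negative deviations from the mean cancel out.
total_deviation = sum(x - x_bar for x in data)

print(round(x_bar, 2), round(total_deviation, 10))
```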
Median
1. Measure of central tendency
2. Middle value in ordered sequence
• If n is odd, middle value of sequence
• If n is even, average of two middle values
3. Position of median in sequence
$$\text{Positioning point} = \frac{n + 1}{2}$$
4. Not affected by extreme values
Median Example
Odd-Sized Sample
• Raw Data: 24.1 22.6 21.5 23.7 22.6
• Ordered: 21.5 22.6 22.6 23.7 24.1
• Position: 1 2 3 4 5

$$\text{Positioning point} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3$$
$$\text{Median} = 22.6$$
Median Example
Even-Sized Sample
• Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• Ordered: 4.9 6.3 7.7 8.9 10.3 11.7
• Position: 1 2 3 4 5 6

$$\text{Positioning point} = \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$$
$$\text{Median} = \frac{7.7 + 8.9}{2} = 8.30$$
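
Both median examples follow the same rule, which can be sketched in Python as below. This is an added illustration; sample_median is a hypothetical helper name. Note that the positioning point (n + 1)/2 counts positions from 1, while Python list indices start at 0.

```python
def sample_median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2  # 0-based index of the upper middle element
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [24.1, 22.6, 21.5, 23.7, 22.6]
even_sample = [10.3, 4.9, 8.9, 11.7, 6.3, 7.7]

print(sample_median(odd_sample))             # 22.6
print(round(sample_median(even_sample), 2))  # 8.3
```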
Mean and Median for the 50 R&D
Percentages

Median = 8.05 Mean = 8.492


The mean is larger than the median, so the data are skewed to the right: there are
more extreme measurements in the right tail of the distribution than in the left
tail.
Detecting Skewness by Comparing
the Mean and the Median
A data set is said to be skewed if one tail of the
distribution has more extreme observations
than the other tail.
– If the data set is skewed to the right, then typically
the median is less than the mean
– If the data set is symmetric, the mean equals the
median
– If the data set is skewed to the left, then typically
the mean is less than the median
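
The comparison rule can be written as a small helper. The Python sketch below is an added illustration: the function name skew_hint and the tolerance parameter are assumptions, and the result is only a rule of thumb, since the slide says "typically".

```python
def skew_hint(mean, median, tolerance=0.0):
    """Suggest the direction of skew by comparing the mean with the median."""
    if mean - median > tolerance:
        return "right"      # mean pulled toward a long right tail
    if median - mean > tolerance:
        return "left"       # mean pulled toward a long left tail
    return "symmetric"

# R&D percentages from the earlier slide: mean 8.492, median 8.05.
print(skew_hint(8.492, 8.05))  # right
```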
Detecting Skewness by Comparing
the Mean and the Median
Skew: The extent to which a distribution is symmetric
or has a tail.

Left-Skewed: Mean < Median
Symmetric: Mean = Median
Right-Skewed: Median < Mean



Mode
1. Measure of central tendency
2. Value that occurs most often in the data set
3. Not affected by extreme values
4. May be no mode or several modes
5. May be used for quantitative or qualitative
data



Mode Example
• No Mode
Raw Data: 10.3 4.9 8.9 11.7 6.3 7.7
• One Mode
Raw Data: 6.3 4.9 8.9 6.3 4.9 4.9
• More Than One Mode
Raw Data: 21 28 28 41 43 43
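
The three cases can be handled by one function. This Python sketch is an added illustration; it follows the convention used on this slide that a data set in which every value occurs exactly once has no mode, and it assumes the input is non-empty. The function name modes is hypothetical.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most often; an empty list means no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))  # []
print(modes([6.3, 4.9, 8.9, 6.3, 4.9, 4.9]))    # [4.9]
print(modes([21, 28, 28, 41, 43, 43]))          # [28, 43]
```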



Summary of
Central Tendency Measures

Measure   Formula                Description
Mean      $\sum x_i / n$         Balance point
Median    $(n + 1)/2$ position   Middle value when ordered
Mode      none                   Most frequent



Example 2- Finding the Measures
of Central Tendency

You are a financial analyst for Prudential-Bache
Securities. You have collected the following closing
stock prices of new stock issues: 17, 16, 21, 18, 13,
16, 12, 11.
Describe the stock prices in terms of central tendency.



Central Tendency Solution
Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_8}{8}$$
$$= \frac{17 + 16 + 21 + 18 + 13 + 16 + 12 + 11}{8} = 15.5$$
Central Tendency Solution
Median
• Raw Data: 17 16 21 18 13 16 12 11
• Ordered: 11 12 13 16 16 17 18 21
• Position: 1 2 3 4 5 6 7 8

$$\text{Positioning point} = \frac{n + 1}{2} = \frac{8 + 1}{2} = 4.5$$
$$\text{Median} = \frac{16 + 16}{2} = 16$$
Central Tendency Solution

Mode
Raw Data: 17 16 21 18 13 16 12 11

Mode = 16
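
All three measures for the stock prices can be cross-checked with Python's standard statistics module. This is an added sketch, not part of the original solution; the variable names are illustrative.

```python
import statistics

prices = [17, 16, 21, 18, 13, 16, 12, 11]

mean_price = statistics.mean(prices)      # 15.5
median_price = statistics.median(prices)  # 16.0 (average of the 4th and 5th ordered values)
mode_price = statistics.mode(prices)      # 16 (the only value that occurs twice)

print(mean_price, median_price, mode_price)
```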



Example 3-Comparing the Mean,
Median, and Mode− CEO Salaries
Refer to Forbes magazine’s “Executive Compensation
Scoreboard,” which lists the total annual pay for CEOs
at the 500 largest U.S. firms. The data for the 2011
scoreboard includes the quantitative variables total
annual pay (in millions of dollars) and age (in years).
Find the mean, median and mode for both of these
variables. Which measure of central tendency is better
for describing the distribution of total annual pay?
Age?



Example 3-Comparing the Mean,
Median, and Mode− CEO Salaries
Measure   CEO Pay ($ million)   Age (years)
Mean      9.247                 56.62
Median    6.100                 56
Mode      0                     56

