2015 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a
license distributed with a certain product or service or otherwise on a password-protected website for classroom use.
What is Statistics?
Data -> (do some statistical calculations, interpret results) -> Information
Descriptive Statistics
Descriptive statistics deals with methods of organizing,
summarizing, and presenting data in a convenient and
informative way.
Another form of descriptive statistics uses numerical
techniques to summarize data.
Inferential statistics
Inferential statistics is a body of methods used to draw
conclusions or inferences about characteristics of populations
based on sample data. The population in question in this case
is the soft drink consumption of the university's 50,000
students. Interviewing every student would be prohibitively
expensive and extremely time-consuming. Statistical
techniques make such endeavors unnecessary. Instead, we can
sample a much smaller number of students (the sample size is
500) and infer from the data the number of soft drinks
consumed by all 50,000 students. We can then estimate annual
profits for Cola.
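The idea of inferring a population total from a sample can be sketched as follows. The population size (50,000) and sample size (500) come from the text; the consumption figures are simulated purely for illustration, and `estimate_total` is a hypothetical helper name.

```python
import random

POPULATION_SIZE = 50_000   # students at the university (from the text)
SAMPLE_SIZE = 500          # students actually interviewed (from the text)

def estimate_total(sample, population_size):
    """Scale the sample mean up to the whole population."""
    sample_mean = sum(sample) / len(sample)
    return sample_mean * population_size

# Hypothetical population: soft drinks consumed per student, invented for illustration.
rng = random.Random(42)
population = [rng.randint(0, 14) for _ in range(POPULATION_SIZE)]

# Draw a simple random sample and infer the population total from it.
sample = rng.sample(population, SAMPLE_SIZE)
estimated_total = estimate_total(sample, POPULATION_SIZE)

# The true total is normally unknown; it is available here only because the data are simulated.
true_total = sum(population)

print(f"Estimated total: {estimated_total:,.0f}")
print(f"True total:      {true_total:,}")
```

The estimate will not match the true total exactly; how close it is likely to be is the subject of inferential statistics.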
Key Statistical Concepts
Population
A population is the group of all items of interest to
a statistics practitioner.
Frequently very large; sometimes infinite.
E.g. all cola users
Sample
A sample is a set of data drawn from the
population.
Potentially very large, but less than the population.
E.g. a sample of drinkers
Parameter
A descriptive measure of a population.
Statistic
A descriptive measure of a sample.
[Diagram: the sample is a subset of the population; a parameter describes the population and a statistic describes the sample.]
Populations have Parameters,
Samples have Statistics.
Descriptive Statistics
are methods of organizing, summarizing, and presenting
data in a convenient and informative way. These methods
include:
Graphical Techniques (Chapter 2), and
Numerical Techniques (Chapter 4).
The actual method used depends on what information we
would like to extract. Are we interested in
measure(s) of central location? and/or
measure(s) of variability (dispersion)?
Inferential Statistics
Descriptive statistics describe the data set that's being
analyzed, but don't allow us to draw any conclusions or
make any inferences about the data. Hence we need
another branch of statistics: inferential statistics.
Statistical Inference
Statistical inference is the process of making an estimate,
prediction, or decision about a population based on a sample.
[Diagram: a sample is drawn from the population; the sample statistic is used to make an inference about the population parameter.]
Types of Data
Cross-sectional data
Data collected by recording a characteristic of many subjects
at the same point in time, or without regard to differences in
time.
Subjects might include individuals, households, firms,
industries, regions, and countries.
Variables and Scales of Measurement
Scales of Measure
- Nominal (qualitative)
- Ordinal (qualitative)
- Ratio (quantitative)
Hierarchy of Data
Ratio
Values are real numbers.
All calculations are valid.
Data may be treated as ordinal or nominal.
Ordinal
Values must represent the ranked order of the data.
Calculations based on an ordering process are valid.
Nominal
Values are the arbitrary numbers that represent categories.
Only calculations based on the frequencies of occurrence are valid.
Data cannot be treated as ordinal or ratio.
Graphical & Tabular Techniques for Nominal Data
The only allowable calculation on nominal data is to count
the frequency of each value of the variable.
Example 2.1 Work Status in the GSS 2012 Survey
[GSS2012*] In Chapter 1 we briefly introduced the General Social Survey.
In the 2012 survey, respondents were asked the following question:
"Last week were you working full time, part time, going to school, keeping
house, or what?" The responses were:
1. Working full time
2. Working part time
3. Temporarily not working
4. Unemployed, laid off
5. Retired
6. School
7. Keeping house
8. Other
The responses were recorded using the codes 1, 2, 3, 4, 5, 6, 7, and 8,
respectively.
Frequency and Relative Frequency Distributions
Nominal Data (Frequency)
[Figures: bar chart and pie chart of WRKSTAT. The values shown on the two charts are:]
WRKSTAT   Frequency   Percent
1         912         46.2%
2         226         11.5%
3         40          2.0%
4         104         5.3%
5         357         18.1%
6         70          3.5%
7         210         10.6%
8         54          2.7%
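As a rough check, the percentages on the pie chart can be reproduced from the bar-chart frequencies. In this sketch the pairing of each frequency with a WRKSTAT code is an assumption, inferred by matching the bar heights to the pie-chart percentages.

```python
# Frequencies for WRKSTAT codes 1-8, read from the bar chart.
# Assumption: each frequency is paired with its code by matching the pie-chart percentages.
frequencies = {1: 912, 2: 226, 3: 40, 4: 104, 5: 357, 6: 70, 7: 210, 8: 54}

total = sum(frequencies.values())

# Relative frequency = frequency of the category / total number of observations.
relative = {code: count / total for code, count in frequencies.items()}

for code, count in frequencies.items():
    print(f"{code}  {count:>4}  {relative[code]:6.1%}")
```

Counting is the only calculation used here, which is all that nominal data permit.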
Describing the Relationship between Two Nominal Variables
To describe the relationship between two nominal variables, we must
remember that we are permitted only to determine the frequency of the
values. As a first step we need to produce a cross-classification table,
which lists the frequency of each combination of the values of the two
variables.
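A cross-classification table can be sketched with nothing more than counting. The observations below are invented for illustration (a work-status code paired with a second, hypothetical nominal variable); they are not taken from the survey.

```python
from collections import Counter

# Hypothetical observations: (work status code, group) pairs, invented for illustration.
observations = [
    (1, "F"), (1, "M"), (2, "F"), (1, "M"), (5, "F"),
    (2, "F"), (1, "F"), (5, "M"), (1, "M"), (2, "M"),
]

# Frequency of each combination of the two variables.
counts = Counter(observations)

rows = sorted({status for status, _ in observations})
cols = sorted({group for _, group in observations})

# Print the table: one row per work status, one column per group.
print("status " + " ".join(f"{c:>3}" for c in cols))
for r in rows:
    print(f"{r:>6} " + " ".join(f"{counts[(r, c)]:>3}" for c in cols))
```

A `Counter` returns 0 for combinations that never occur, so empty cells need no special handling.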
Problem
A chocolate manufacturing company sells quality chocolate products at
its plant and retail stores. Two years ago, the company developed a Web
site and began selling its products over the Internet. Web site sales have
exceeded the company's expectations, and management is now considering
strategies to increase sales even further. To learn more about the Web site
customers, a sample of 50 transactions was selected from the
previous month's sales.
The data show:
the day of the week each transaction was made,
the type of browser the customer used,
the time spent on the Web site,
the number of Web site pages viewed,
the amount spent by each of the 50 customers.
Example 3.1
Following deregulation of telephone service, several new
companies were created to compete in the business of
providing long-distance telephone service. In almost all
cases these companies competed on price since the service
each offered is similar. Pricing a service or product in the
face of stiff competition is very difficult. Factors to be
considered include supply, demand, price elasticity, and the
actions of competitors. Long-distance packages may employ
per-minute charges, a flat monthly rate, or some combination
of the two. Determining the appropriate rate structure is
facilitated by acquiring information about the behaviors of
customers and in particular the size of monthly long-distance
bills.
As part of a larger study, a long-distance company wanted to
acquire information about the monthly bills of new
subscribers in the first month after signing with the
company. The company's marketing manager conducted a
survey of 200 new residential subscribers in which the first
month's bills were recorded. These data are stored in file
Xm03-01. The general manager planned to present his
findings to senior executives. What information can be
extracted from these data?
We have chosen eight classes defined in such a way that each
observation falls into one and only one class. These classes are defined
as follows:
Classes
Amounts that are less than or equal to 15
Amounts that are more than 15 but less than or equal to 30
Amounts that are more than 30 but less than or equal to 45
Amounts that are more than 45 but less than or equal to 60
Amounts that are more than 60 but less than or equal to 75
Amounts that are more than 75 but less than or equal to 90
Amounts that are more than 90 but less than or equal to 105
Amounts that are more than 105 but less than or equal to 120
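The class definitions above (each class includes its upper limit but not its lower limit) can be sketched as a lookup on the upper limits. The function names and the sample bills are hypothetical, and rejecting negative amounts is an assumption, since the first class states no lower limit.

```python
from bisect import bisect_left

UPPER_LIMITS = [15, 30, 45, 60, 75, 90, 105, 120]  # class upper limits from the text

def class_index(amount):
    """Return the 0-based class for amount; each class includes its upper limit."""
    # Assumption: bills are non-negative and never exceed the last upper limit.
    if amount < 0 or amount > UPPER_LIMITS[-1]:
        raise ValueError(f"amount {amount} is outside the defined classes")
    # bisect_left finds the first upper limit that is >= amount.
    return bisect_left(UPPER_LIMITS, amount)

def frequency_distribution(amounts):
    """Count how many amounts fall into each class."""
    counts = [0] * len(UPPER_LIMITS)
    for amount in amounts:
        counts[class_index(amount)] += 1
    return counts

# Hypothetical bills, chosen to sit on and around the class boundaries.
bills = [3.50, 15.00, 15.01, 29.99, 30.00, 88.10, 119.63]
print(frequency_distribution(bills))
```

Because every amount maps to exactly one index, each observation falls into one and only one class.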
[Figure: histogram of the long-distance bills; x-axis: Bills (class upper limits 15 to 120), y-axis: Frequency (0 to 80)]
Interpret
About half of the bills (71+37=108) are small, i.e. less than $30.
(18+28+14=60)÷200 = 30%, i.e. nearly a third of the phone bills are $90 or more.
Building a Histogram
1) Collect the Data
2) Create a frequency distribution for the data
How?
a) Determine the number of classes to use
How?
Refer to Table 3.2: with 200 observations, we should have between 7 and 10 classes.
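Table 3.2 is not reproduced here. As an alternative rule of thumb, Sturges's formula (number of classes = 1 + 3.3 log10(n)) gives a comparable answer; the sketch below uses that formula, which is a swapped-in technique and not necessarily how the table was built.

```python
import math

def sturges_classes(n):
    """Sturges's formula for the suggested number of classes: 1 + 3.3 * log10(n)."""
    return 1 + 3.3 * math.log10(n)

# 200 observations, as in the telephone-bill example.
suggested = sturges_classes(200)
print(round(suggested, 2))
```

For 200 observations the result lies between 8 and 9, which agrees with the 7 to 10 range quoted above.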
Shapes of Histograms
Symmetry
A histogram is said to be symmetric if, when we draw a
vertical line down the center of the histogram, the two sides
are identical in shape and size:
[Figure: three symmetric histograms; x-axis: Variable, y-axis: Frequency]
Skewness
A skewed histogram is one with a long tail extending to
either the right or the left:
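[Figure: two skewed histograms, one with a long tail to the right and one with a long tail to the left; x-axis: Variable, y-axis: Frequency]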
Modality
A unimodal histogram is one with a single peak, while a
bimodal histogram is one with two peaks:
[Figure: a unimodal histogram and a bimodal histogram; x-axis: Variable, y-axis: Frequency]
Bell Shape
A special type of symmetric unimodal histogram is one that
is bell shaped:
[Figure: a bell-shaped histogram; y-axis: Frequency]
Ogive
(pronounced Oh-jive) is a graph of
a cumulative frequency distribution.
Relative Frequencies
For example, we had 71 observations in our first class
(telephone bills from $0.00 to $15.00). Thus, the relative
frequency for this class is 71 ÷ 200 (the total number of phone
bills) = 0.355 (or 35.5%).
Cumulative Relative Frequencies
First class: 71 ÷ 200 = 0.355
First two classes: (71 + 37) ÷ 200 = 0.540
and so on for the remaining classes.
Ogive
Is a graph of a cumulative frequency distribution.
1) Calculate relative frequencies.
2) Calculate cumulative relative frequencies.
3) Graph the cumulative relative frequencies
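Steps 1 and 2 can be sketched in a few lines. The class frequencies below are hypothetical, and step 3 (the graph itself) is left out because it would need a plotting library.

```python
from itertools import accumulate

def cumulative_relative_frequencies(frequencies):
    """Return the running total of relative frequencies, one value per class."""
    total = sum(frequencies)
    relative = [f / total for f in frequencies]   # step 1: relative frequencies
    return list(accumulate(relative))             # step 2: cumulative relative frequencies

# Hypothetical class upper limits and frequencies, invented for illustration.
upper_limits = [15, 30, 45, 60]
frequencies = [8, 6, 4, 2]

for limit, cumulative in zip(upper_limits, cumulative_relative_frequencies(frequencies)):
    print(f"<= {limit}: {cumulative:.3f}")
```

Plotting each upper limit against its cumulative relative frequency, and joining the points, gives the ogive; the last point is always at 1.0.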
Ogive
[Figure: ogive of the telephone bills, annotated "around $35"]
(Refer also to Fig. 2.13 in your textbook)
Numerical Descriptive Techniques
Measures of Central Location
Mean, Median, Mode
Measures of Variability
Range, Standard Deviation, Variance, Coefficient of Variation
Notation
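When referring to the number of observations in a
population, we use uppercase letter N; for a sample, we use lowercase n.
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Population Mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$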
The Arithmetic Mean
is appropriate for describing measurement data, e.g.
heights of people, marks of student papers, etc.
Measures of Central Location
The median is calculated by placing all the observations in
order; the observation that falls in the middle is the median.
A set of data may have one mode (or modal class), or two, or
more modes.
The mode is useful for all data types, though it is mainly used for
nominal data.
For large data sets the modal class is much more relevant
than a single-value mode.
[Figure: histogram with the modal class marked; x-axis: Variable, y-axis: Frequency]
=MODE(range) in Excel
Note: if you are using Excel for your data analysis and your
data is multi-modal (i.e. there is more than one mode), Excel
only calculates the smallest one.
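Outside Excel, a multi-modal data set can be checked by listing every mode. A minimal sketch using Python's standard `statistics` module, with made-up data:

```python
import statistics

# Hypothetical data with two modes (4 and 7 each occur twice).
data = [4, 7, 7, 9, 4, 12]

# multimode returns every value that shares the highest frequency.
modes = statistics.multimode(data)
print(modes)
```

If the list contains more than one value, reporting a single mode would hide part of the picture.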
Mean, Median, Mode
If a distribution is symmetrical,
the mean, median and mode may coincide
[Figure: symmetric distribution with the mean, median, and mode marked at the same point]
If a distribution is asymmetrical, say skewed to the left or to
the right, the three measures may differ. E.g.:
[Figure: skewed distribution with the mode, median, and mean marked at different points]
Mean, Median, Mode: Which Is Best?
With three measures from which to choose, which one
should we use?
To illustrate, consider the data in Example 4.1.
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{0 + 7 + 12 + 5 + 133 + 14 + 8 + 0 + 9 + 22}{10} = \frac{210}{10} = 21.0$
This value is exceeded by only two of the ten
observations in the sample, making this statistic a poor
measure of central location.
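The effect of one extreme observation on the mean can be checked directly. In this sketch, nine of the ten observations are the ones legible on the slide; the tenth value (9) is an assumption, implied by the stated total of 210 for ten observations.

```python
import statistics

# Nine values are legible on the slide; the tenth (9) is implied by the stated total of 210.
observations = [0, 7, 12, 5, 133, 14, 8, 0, 9, 22]

mean = statistics.mean(observations)
median = statistics.median(observations)

# Which observations actually exceed the mean?
above_mean = [x for x in observations if x > mean]

print(mean, median, above_mean)
```

The median sits in the middle of the ordered data and is barely affected by the single large value, which is why it is the better measure of central location here.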
Mean, Median, & Modes for Ordinal & Nominal Data
For ordinal and nominal data the calculation of the mean is
not valid.
Measures of Central Location Summary
Compute the Mean to
Describe the central location of a single set of interval
data
Measures of Variability
Measures of central location fail to tell the whole story about
the distribution; that is, how much are the observations
spread out around the mean value?
Range
The range is the simplest measure of variability, calculated
as:
Range = Largest observation - Smallest observation
E.g.
Data: {4, 4, 4, 4, 50} Range = 46
Data: {4, 8, 15, 24, 39, 50} Range = 46
The range is the same in both cases,
but the data sets have very different distributions
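The two data sets above can be compared in a couple of lines; `data_range` is a hypothetical helper name.

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

first = [4, 4, 4, 4, 50]
second = [4, 8, 15, 24, 39, 50]

print(data_range(first), data_range(second))
```

Only the two extreme observations enter the calculation, so the range says nothing about how the values in between are distributed.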
Its major advantage is the ease with which it can be
computed.
Variance
Variance and its related measure, standard deviation, are
arguably the most important statistics. Used to measure
variability, they also play a vital role in almost all statistical
inference procedures.
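Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, where $\mu$ is the population mean and $N$ is the population size.
Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean and $n$ is the sample size.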
As you can see, you have to calculate the sample mean (x-bar)
in order to calculate the sample variance.
Application
Example 4.7. The following sample consists of the number
of jobs six students applied for: 17, 15, 23, 7, 9, 13.
Find its mean and variance.
Because the data are a sample, we compute $\bar{x}$ and $s^2$, as opposed to $\mu$ or $\sigma^2$.
Sample Mean & Variance
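Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{17 + 15 + 23 + 7 + 9 + 13}{6} = \frac{84}{6} = 14$
Sample Variance: $s^2 = \frac{\sum_{i=1}^{6} (x_i - \bar{x})^2}{6 - 1} = \frac{(17-14)^2 + (15-14)^2 + (23-14)^2 + (7-14)^2 + (9-14)^2 + (13-14)^2}{5} = \frac{166}{5} = 33.2$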
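The calculation for Example 4.7 can be verified with a short script. It computes the sample mean and sample variance from their definitions (dividing by n - 1) and compares the result with Python's built-in `statistics.variance`.

```python
import statistics

# Number of jobs the six students applied for (Example 4.7).
jobs = [17, 15, 23, 7, 9, 13]

n = len(jobs)
x_bar = sum(jobs) / n                                   # sample mean
s2 = sum((x - x_bar) ** 2 for x in jobs) / (n - 1)      # sample variance, denominator n - 1
s = s2 ** 0.5                                           # sample standard deviation

print(x_bar, s2, round(s, 2))

# Cross-check against the standard library's sample variance.
print(statistics.variance(jobs))
```

Note the denominator of n - 1: using n instead would give the population formula and a smaller value.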
Standard Deviation
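The standard deviation is simply the square root of the
variance, thus:
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$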
Consider Example 4.8 [Xm04-08], where a golf club manufacturer has
designed a new club and wants to determine if it is hit more
consistently (i.e. with less variability) than with an old club.
Interpreting Standard Deviation
The standard deviation can be used to compare the variability of
several distributions and make a statement about the general shape
of a distribution. If the histogram is bell shaped, we can use the
Empirical Rule, which states:
The Empirical Rule
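Approximately 68% of all observations fall
within one standard deviation of the mean.
Approximately 95% of all observations fall
within two standard deviations of the mean.
Approximately 99.7% of all observations fall
within three standard deviations of the mean.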
Interpreting Standard Deviation
Suppose that the mean and standard deviation of last year's
midterm test marks are 70 and 5, respectively. If the
histogram is bell-shaped then we know that approximately
68% of the marks fell between 65 and 75, approximately
95% of the marks fell between 60 and 80, and approximately
99.7% of the marks fell between 55 and 85.
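The three intervals in this example follow mechanically from the mean and standard deviation. A small sketch, with `empirical_rule_intervals` as a hypothetical helper name; it is only meaningful when the histogram is roughly bell shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Return {k: (lower, upper, approximate share)} for k = 1, 2, 3 standard deviations."""
    shares = {1: 0.68, 2: 0.95, 3: 0.997}
    return {k: (mean - k * sd, mean + k * sd, share) for k, share in shares.items()}

# Midterm marks from the example: mean 70, standard deviation 5.
for k, (low, high, share) in empirical_rule_intervals(70, 5).items():
    print(f"within {k} sd: {low} to {high} (about {share:.1%} of marks)")
```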
Coefficient of Variation
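The coefficient of variation of a set of observations is the
standard deviation of the observations divided by their mean,
that is:
Population coefficient of variation: $CV = \sigma / \mu$
Sample coefficient of variation: $cv = s / \bar{x}$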
Coefficient of Variation
This coefficient provides a
proportionate measure of variation, e.g.
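A minimal sketch of the idea: two data sets with the same standard deviation but different means have different coefficients of variation. The two data sets below are hypothetical, chosen only for illustration, and the helper name `coefficient_of_variation` is our own. We assume the sample version (sample standard deviation divided by sample mean).

```python
import statistics


def coefficient_of_variation(observations):
    """Sample coefficient of variation: sample standard deviation / sample mean."""
    mean = statistics.mean(observations)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    return statistics.stdev(observations) / mean


# Hypothetical data: both sets have a sample standard deviation of 10.
small_mean = [90, 100, 110]
large_mean = [490, 500, 510]

cv_small = coefficient_of_variation(small_mean)
cv_large = coefficient_of_variation(large_mean)
print(cv_small, cv_large)
```

The same absolute spread is proportionately larger for the set with the smaller mean, which is what the coefficient is designed to show.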
Measures of Relative Standing & Box Plots
Measures of relative standing are designed to provide
information about the position of particular values relative to
the entire data set.
Quartiles
We have special names for the 25th, 50th, and 75th
percentiles, namely quartiles.
We can also convert percentiles into quintiles (fifths) and deciles (tenths).
Commonly Used Percentiles
First (lower) decile = 10th percentile
First (lower) quartile, Q1, = 25th percentile
Second (middle) quartile, Q2, = 50th percentile
Third quartile, Q3, = 75th percentile
Ninth (upper) decile = 90th percentile
Location of Percentiles
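The following formula allows us to approximate the location
of any percentile:
$L_P = (n + 1)\dfrac{P}{100}$
where $L_P$ is the location of the $P$th percentile and $n$ is the number of observations.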
Location of Percentiles
Recall the data from Example 4.1:
0 0 5 7 8 9 12 14 22 33
The location of the 75th percentile is $L_{75} = (10 + 1)(75/100) = 8.25$.
It is located one-quarter of the distance between the eighth and the ninth
observations, which are 14 and 22, respectively. One-quarter of the distance
is (.25)(22 - 14) = 2, which means the 75th percentile is 14 + 2 = 16.
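The interpolation can be sketched in a few lines. We assume the location rule $L_P = (n + 1)P/100$, which is consistent with the position 8.25 used for the 75th percentile of these ten observations; the function name `percentile_value` is our own.

```python
import math


def percentile_value(sorted_data, p):
    """Value of the p-th percentile, assuming the location L_P = (n + 1) * p / 100.

    sorted_data must already be in ascending order.
    """
    n = len(sorted_data)
    location = (n + 1) * p / 100      # 1-based position, possibly fractional
    lower = math.floor(location)      # observation just below the location
    fraction = location - lower       # how far to move toward the next observation
    if lower < 1:                     # location falls before the first observation
        return sorted_data[0]
    if lower >= n:                    # location falls at or beyond the last observation
        return sorted_data[-1]
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + fraction * (above - below)


data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]   # Example 4.1
p75 = percentile_value(data, 75)            # location 8.25, between 14 and 22
p25 = percentile_value(data, 25)            # location 2.75, between 0 and 5
print(p25, p75)
```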
Location of Percentiles
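Please remember:
0 0 | 5 7 8 9 12 14 | 22 33
Position 2.75 (between the second and third observations): 25th percentile = 3.75
Position 8.25 (between the eighth and ninth observations): 75th percentile = 16
$L_P$ determines the position in the data set where the percentile value lies,
not the value of the percentile itself.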
Interquartile Range
The quartiles can be used to create another measure of
variability, the interquartile range, which is defined as
follows:
Interquartile Range = Q3 - Q1
Large values of this statistic mean that the 1st and 3rd
quartiles are far apart, indicating a high level of variability.
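As a rough illustration, the interquartile range of the Example 4.1 data can be computed with Python's standard library. We assume here that the `"exclusive"` method of `statistics.quantiles` interpolates at position (n + 1)P/100, so it should reproduce the quartiles 3.75 and 16 for this data set; other software may use a different rule and give slightly different quartiles.

```python
import statistics

data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]   # Example 4.1

# n=4 splits the data into quarters and returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

interquartile_range = q3 - q1
print(q1, q3, interquartile_range)
```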
Box Plots
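The box plot is a technique that graphs five statistics:
the minimum and maximum observations, and
the first, second, and third quartiles.
Whiskers extend outward from the box by at most 1.5*(Q3 - Q1);
points beyond the whiskers are called outliers.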
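The quantities behind a box plot can be sketched without drawing anything. The helper below, `box_plot_summary`, is a hypothetical name of our own. It assumes quartiles located at (n + 1)P/100 (the `"exclusive"` method of `statistics.quantiles`) and the common convention that a whisker stops at the most extreme observation within 1.5 times the interquartile range of the box.

```python
import statistics


def box_plot_summary(observations):
    """Five statistics plus whisker ends and outliers (1.5 * IQR convention assumed)."""
    data = sorted(observations)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr      # furthest a whisker may reach to the left
    upper_fence = q3 + 1.5 * iqr      # furthest a whisker may reach to the right
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": data[-1],
        "whisker_low": inside[0],     # most extreme non-outlier on each side
        "whisker_high": inside[-1],
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


summary = box_plot_summary([0, 0, 5, 7, 8, 9, 12, 14, 22, 33])  # Example 4.1
print(summary)
```

For the Example 4.1 data no observation lies beyond the fences, so the whiskers end at the minimum and maximum.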
Box Plots
These box plots are based on
data in Xm04-15.
3.5 Mean-Variance Analysis and
the Sharpe Ratio
Explain mean-variance analysis and the Sharpe Ratio.
Mean-variance analysis:
The performance of an asset is measured by its rate of return.
The rate of return may be evaluated in terms of its reward
(mean) and risk (variance).
Higher average returns are often associated with higher risk.
The Sharpe ratio uses the mean and variance to
evaluate risk.
LO 3.5
Mean-Variance Analysis and
the Sharpe Ratio
Sharpe Ratio
Measures the extra reward per unit of risk.
For an investment $I$, the Sharpe ratio is computed as:
$\text{Sharpe Ratio} = \dfrac{\bar{x}_I - \bar{R}_f}{s_I}$
where $\bar{x}_I$ is the mean return for the investment,
$\bar{R}_f$ is the mean return for a risk-free asset, and
$s_I$ is the standard deviation for the investment.
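A minimal sketch of the computation follows. The annual returns below are hypothetical numbers made up for illustration (they are not the Metals or Income fund data), and the function name `sharpe_ratio` is our own. We assume the sample standard deviation is the intended measure of risk.

```python
import statistics


def sharpe_ratio(returns, risk_free_mean):
    """(mean return - mean risk-free return) / sample standard deviation of returns."""
    mean_return = statistics.mean(returns)
    risk = statistics.stdev(returns)
    if risk == 0:
        raise ValueError("Sharpe ratio is undefined when returns have no variability")
    return (mean_return - risk_free_mean) / risk


# Hypothetical annual returns in percent, with a 4% risk-free return.
hypothetical_returns = [10.0, 2.0, 12.0, 8.0]
ratio = sharpe_ratio(hypothetical_returns, risk_free_mean=4.0)
print(round(ratio, 2))
```

When two investments are compared, the one with the larger ratio offers more reward per unit of risk.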
Mean-Variance Analysis and
the Sharpe Ratio
Sharpe Ratio Example
Compute the Sharpe ratios for the Metals and Income funds,
given a risk-free return of 4%.
Since 0.56 > 0.41, the Metals fund offers more reward per unit
of risk than the Income fund.
Parameters and Statistics
                             Population    Sample
Size                         N             n
Mean                         μ             x̄
Variance                     σ²            s²
Standard Deviation           σ             s
Coefficient of Variation     CV            cv
Covariance                   σxy           sxy
Coefficient of Correlation   ρ             r