What Is Statistics?: "Statistics Is A Way To Get Information From Data"

What is Statistics?
1.1
Statistics is a way to get information from data

Data → Statistics → Information
Data: Facts, especially numerical facts, collected together for reference or information.
Information: Knowledge communicated concerning some particular fact.
Statistics is a tool for creating new understanding from a set of numbers.
Definitions: Oxford English Dictionary

Meaning of the term statistics
The word has the following three meanings:
Plural sense
Singular sense
Plural of the word statistic
Plural sense
Any systematically collected data for a specific purpose are described as statistics in the plural sense.
For example: statistics of prices, road accidents, crime, births, educational institutions.
Singular sense
Describes a body of procedures and techniques used to collect, process, and analyze numeric data to make inferences and reach decisions.
Plural of the word statistic
A statistic means a numerical quantity calculated from sample observations.
Definition of Statistics:
1. A collection of quantitative data pertaining to a
subject or group. Examples are blood pressure
statistics etc.
2. The science that deals with the collection,
tabulation, analysis, interpretation, and
presentation of quantitative data
Kinds of Statistics
1) Descriptive Statistics
2) Statistical Inference
Descriptive Statistics
Descriptive statistics are methods of organizing, summarizing, and presenting data in a convenient and informative way.
These methods include:
Graphical Techniques
Numerical Techniques
The actual method used depends on what information
we would like to extract. Are we interested in
measure(s) of central location? and/or
measure(s) of variability (dispersion)?
Descriptive Statistics helps to answer these questions

Statistical Inference
Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.
Population (parameter) → Sample (statistic) → Inference about the population
What can we infer about a population's parameters based on a sample's statistics?
Variables
7
A variable is a characteristic or condition that can

change or take on different values.
Most research begins with a general question about
the relationship between two variables for a
specific group of individuals.
Population
8
The entire group of individuals is called the

population.
For example, a researcher may be interested in the
relation between class size (variable 1) and
academic performance (variable 2) for the
population of third-grade children.
Sample
9
Usually populations are so large that a researcher

cannot examine the entire group. Therefore, a
sample is selected to represent the population in a
research study. The goal is to use the results
obtained from the sample to help answer questions
about the population.
Types of Variables
11
Variables can be classified as discrete or

continuous.
Discrete variables (such as class size) consist of
indivisible categories, and continuous variables
(such as time or weight) are infinitely divisible into
whatever units a researcher may choose. For
example, time can be measured to the nearest
minute, second, half-second, etc.
Real Limits
12
To define the units for a continuous variable, a

researcher must use real limits which are
boundaries located exactly half-way between
adjacent categories.
Measuring Variables
13
To establish relationships between variables,

researchers must observe the variables and record
their observations. This requires that the variables
be measured.
The process of measuring a variable requires a set
of categories called a scale of measurement and a
process that classifies each individual into one
category.
4 Types of Measurement Scales
14
1. A nominal scale is an unordered set of

categories identified only by name. Nominal
measurements only permit you to determine
whether two individuals are the same or different.
2. An ordinal scale is an ordered set of categories.
Ordinal measurements tell you the direction of
difference between two individuals.
3. An interval scale is an ordered series of equal-sized

categories. Interval measurements identify the
direction and magnitude of a difference. The zero
point is located arbitrarily on an interval scale.
4. A ratio scale is an interval scale where a value of
zero indicates none of the variable. Ratio
measurements identify the direction and magnitude
of differences and allow ratio comparisons of
measurements.
Experiments
16
The goal of an experiment is to demonstrate a

cause-and-effect relationship between two
variables; that is, to show that changing the value of
one variable causes changes to occur in a second
variable.
Experiments (cont.)
17
In an experiment, one variable is manipulated to

create treatment conditions. A second variable is
observed and measured to obtain scores for a group
of individuals in each of the treatment conditions. The
measurements are then compared to see if there are
differences between treatment conditions. All other
variables are controlled to prevent them from
influencing the results.
In an experiment, the manipulated variable is called
the independent variable and the observed variable
is the dependent variable.
Definitions
1.18
A variable is some characteristic of a population or

sample.
E.g. student grades.
Typically denoted with a capital letter: X, Y, Z
The values of the variable are the range of possible

values for a variable.
E.g. student marks (0..100)
Data are the observed values of a variable.

E.g. student marks: {67, 74, 71, 83, 93, 55, 48}
Interval Data
1.19
Interval data
Real numbers, i.e. heights, weights, prices, etc.
Also referred to as quantitative or numerical.
Arithmetic operations can be performed on interval data, thus it's meaningful to talk about 2*Height, or Price + $1, and so on.
Nominal Data
1.20
Nominal Data
The values of nominal data are categories.
E.g. responses to questions about marital status, coded
as:
Single = 1, Married = 2, Divorced = 3, Widowed = 4
Because the numbers are arbitrary, arithmetic operations don't make any sense (e.g. does Widowed ÷ 2 = Married?!)
Nominal data are also called qualitative or categorical.

Ordinal Data
1.21
Ordinal Data appear to be categorical in nature, but their

values have an order; a ranking to them:
E.g. College course rating system:

poor = 1, fair = 2, good = 3, very good = 4, excellent = 5
While it's still not meaningful to do arithmetic on this data (e.g. does 2*fair = very good?!), we can say things like:
excellent > poor or fair < very good
That is, order is maintained no matter what numeric values

are assigned to each category.
Graphical & Tabular Techniques for Nominal Data
1.22
The only allowable calculation on nominal data is to

count the frequency of each value of the variable.
We can summarize the data in a table that presents

the categories and their counts called a frequency
distribution.
A relative frequency distribution lists the categories

and the proportion with which each occurs.
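As a minimal sketch of a frequency and relative frequency distribution, the snippet below counts coded marital-status responses. The response list is hypothetical (it is not data from the slides); only the coding scheme follows the nominal-data example.

```python
from collections import Counter

# Hypothetical coded responses: 1 = Single, 2 = Married, 3 = Divorced, 4 = Widowed
responses = [1, 2, 2, 3, 1, 2, 4, 2, 1, 2]
labels = {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"}

# Frequency distribution: how often each category occurs
frequency = Counter(responses)

# Relative frequency distribution: proportion of responses in each category
n = len(responses)
relative_frequency = {code: count / n for code, count in frequency.items()}

for code in sorted(frequency):
    print(f"{labels[code]:<9} {frequency[code]:>3} {relative_frequency[code]:.2f}")
```

Counting is the only calculation used here, which is all that nominal data permit.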
Nominal Data (Tabular Summary)
1.23
Nominal Data (Frequency)
1.24
Bar Charts are often used to display frequencies

Nominal Data
1.25
It's all the same information (based on the same data), just a different presentation.
Graphical Techniques for Interval Data
1.26
There are several graphical methods that are used

when the data are interval (i.e. numeric, non-
categorical).
The most important of these graphical methods is

the histogram.
The histogram is not only a powerful graphical

technique used to summarize interval data, but it is
also used to help explain probabilities.
Building a Histogram
1.27
1) Collect the Data

2) Create a frequency distribution for the data.
3) Draw the Histogram.
Histogram and Stem & Leaf
1.28
Ogive
An ogive is a graph of a cumulative frequency distribution.
We create an ogive in three steps:
1) Calculate relative frequencies.
2) Calculate cumulative relative frequencies by adding the current class relative frequency to the previous class cumulative relative frequency. (For the first class, its cumulative relative frequency is just its relative frequency.)
3) Graph the cumulative relative frequencies.
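The running-sum step can be sketched as follows. The class limits and counts are hypothetical placeholders, not the telephone-bill data behind the slides.

```python
from itertools import accumulate

# Hypothetical class upper limits and counts (assumed for illustration only)
class_upper_limits = [15, 30, 45, 60, 75]
counts = [20, 50, 70, 40, 20]

# Step 1: relative frequency of each class
total = sum(counts)
relative = [c / total for c in counts]

# Step 2: cumulative relative frequency = running sum of relative frequencies
cumulative = list(accumulate(relative))

for limit, rel, cum in zip(class_upper_limits, relative, cumulative):
    print(f"<= {limit:>3}: relative = {rel:.3f}, cumulative = {cum:.3f}")
```

Plotting `cumulative` against `class_upper_limits` would give the ogive itself; the last cumulative value should always be 1.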
Cumulative Relative Frequencies
First class: .355
Next class: .355 + .185 = .540
...
Last class: .930 + .070 = 1.00

Ogive
1.31
The ogive can be used to

answer questions like:
What telephone bill value

is at the 50th percentile?
around $35
(Refer also to Fig. 2.13 in your textbook)
Scatter Diagram
1.32
Example 2.9 A real estate agent wanted to know

to what extent the selling price of a home is
related to its size
1) Collect the data

2) Determine the independent variable (X: house size) and the dependent variable (Y: selling price)
3) Use Excel to create a scatter diagram
Scatter Diagram
1.33
It appears that in fact there is a relationship, that is,

the greater the house size the greater the selling
price
Patterns of Scatter Diagrams
1.34
Linearity and Direction are two concepts we are

interested in
Positive Linear Relationship Negative Linear Relationship
Weak or Non-Linear Relationship

Time Series Data
1.35
Observations measured at the same point in time

are called cross-sectional data.
Observations measured at successive points in time

are called time-series data.
Time-series data are often graphed on a line chart, which plots the value of the variable on the vertical axis against the time periods on the horizontal axis.
Numerical Descriptive Techniques
1.36
Measures of Central Location

Mean, Median, Mode
Measures of Variability
Range, Standard Deviation, Variance, Coefficient of Variation
Measures of Relative Standing

Percentiles, Quartiles
Measures of Linear Relationship

Covariance, Correlation, Least Squares Line
Measures of Central Location
1.37
The arithmetic mean, a.k.a. average, shortened to

mean, is the most popular & useful measure of
central location.
It is computed by simply adding up all the

observations and dividing by the total number of
observations:
Mean = (Sum of the observations) / (Number of observations)
Arithmetic Mean
Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Statistics is a pattern language
        Population   Sample
Size    N            n
Mean    $\mu$        $\bar{x}$
The Arithmetic Mean
The arithmetic mean is appropriate for describing measurement data, e.g. heights of people, marks of student papers, etc.
It is seriously affected by extreme values called outliers. E.g. as soon as a billionaire moves into a neighborhood, the average household income increases beyond what it was previously!
Measures of Variability
1.41
Measures of central location fail to tell the whole

story about the distribution; that is, how much are
the observations spread out around the mean
value?
For example, two sets of class grades are
shown. The mean (=50) is the same in
each case
But, the red class has greater variability

than the blue class.
Range
1.42
The range is the simplest measure of variability,

calculated as:
Range = Largest observation − Smallest observation
E.g.
Data: {4, 4, 4, 4, 50} Range = 46
Data: {4, 8, 15, 24, 39, 50} Range = 46
The range is the same in both cases,
but the data sets have very different
distributions
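A small sketch of the two range calculations above, with the sample standard deviation added to show that the spread really does differ. The helper name `data_range` is ours, not from the slides.

```python
import statistics

def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

first = [4, 4, 4, 4, 50]
second = [4, 8, 15, 24, 39, 50]

# Both ranges are equal even though the data sets look very different
print(data_range(first), data_range(second))

# The standard deviations differ, which the range alone cannot show
print(statistics.stdev(first), statistics.stdev(second))
```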
Statistics is a pattern language
           Population    Sample
Size       N             n
Mean       $\mu$         $\bar{x}$
Variance   $\sigma^2$    $s^2$
Variance
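The variance of a population is:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
where $\mu$ is the population mean and N is the population size.
The variance of a sample is:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
where $\bar{x}$ is the sample mean.
Note! The denominator is sample size (n) minus one!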
Application
Example 4.7. The following sample consists of the number of jobs six randomly selected students applied for: 17, 15, 23, 7, 9, 13.
Find its mean and variance.
What are we looking to calculate? Because the data are a sample, we want the sample mean $\bar{x}$ and sample variance $s^2$, as opposed to the population parameters $\mu$ or $\sigma^2$.
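Sample Mean & Variance
Sample Mean
$\bar{x} = \frac{\sum x_i}{n} = \frac{17 + 15 + 23 + 7 + 9 + 13}{6} = \frac{84}{6} = 14$
Sample Variance
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{166}{5} = 33.2$
Sample Variance (shortcut method)
$s^2 = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right] = \frac{1}{5}\left[1342 - \frac{84^2}{6}\right] = 33.2$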
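The Example 4.7 sample can be checked with Python's standard `statistics` module. This is a sketch; the shortcut line assumes the usual identity that the sum of squared deviations equals the sum of squares minus (sum)²/n.

```python
import statistics

# Number of jobs applied for by six randomly selected students (Example 4.7)
jobs = [17, 15, 23, 7, 9, 13]
n = len(jobs)

# Sample mean
mean = statistics.mean(jobs)

# Sample variance (statistics.variance divides by n - 1)
variance = statistics.variance(jobs)

# Shortcut method: (sum of squares - (sum)^2 / n) / (n - 1)
shortcut = (sum(x * x for x in jobs) - sum(jobs) ** 2 / n) / (n - 1)

print(mean, variance, shortcut)
```

Both routes should agree; `statistics.pvariance` would divide by n instead and is the population version.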

Standard Deviation
The standard deviation is simply the square root of the variance, thus:
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$

Standard Deviation
1.48
Consider Example 4.8 where a golf club

manufacturer has designed a new club and wants to
determine if it is hit more consistently (i.e. with less
variability) than with an old club.
Using Tools > Data Analysis > Descriptive Statistics in Excel (the Data Analysis add-in may need to be installed), we produce the following tables for interpretation.
You get more consistent distance with the new club.
The Empirical Rule
If the histogram is bell shaped:
Approximately 68% of all observations fall within one standard deviation of the mean.
Approximately 95% of all observations fall within two standard deviations of the mean.
Approximately 99.7% of all observations fall within three standard deviations of the mean.
Chebysheff's Theorem
A more general interpretation of the standard deviation is derived from Chebysheff's Theorem, which applies to all shapes of histograms (not just bell shaped). It is not often used because the interval is very wide.
The proportion of observations in any sample that lie within k standard deviations of the mean is at least:
$1 - \frac{1}{k^2}$ for $k > 1$
For k=2 (say), the theorem states that at least 3/4 of all observations lie within 2 standard deviations of the mean. This is a lower bound compared to the Empirical Rule's approximation (95%).
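A short sketch of the Chebysheff lower bound, 1 − 1/k², for a few values of k. The function name is ours.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.4f}")
```

For k = 2 this gives 0.75, noticeably weaker than the Empirical Rule's 95%, which is the price of making no assumption about the histogram's shape.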
Box Plots
These box plots are based on data in Xm04-15.
Wendy's service time is shortest and least variable.
Hardee's has the greatest variability, while Jack-in-the-Box has the longest service times.
Methods of Collecting Data
1.52
There are many methods used to collect or obtain

data for statistical analysis. Three of the most
popular methods are:
Direct Observation
Experiments, and
Surveys.
Sampling
1.53
Recall that statistical inference permits us to draw

conclusions about a population based on a sample.
Sampling (i.e. selecting a subset of a whole population) is often done for reasons of cost (it's less expensive to sample 1,000 television viewers than 100 million TV viewers) and practicality (e.g. performing a crash test on every automobile produced is impractical).
In any case, the sampled population and the target

population should be similar to one another.
Sampling Plans
1.54
A sampling plan is just a method or procedure for

specifying how a sample will be taken from a
population.
We will focus our attention on these three methods:
Simple Random Sampling,

Stratified Random Sampling, and
Cluster Sampling.
Simple Random Sampling
1.55
A simple random sample is a sample selected in such

a way that every possible sample of the same size
is equally likely to be chosen.
Drawing three names from a hat containing all the names of the students in the class is an example of a simple random sample: any group of three names is as likely to be picked as any other group of three names.
Stratified Random Sampling
1.56
After the population has been stratified, we can use

simple random sampling to generate the complete
sample:
If we only have sufficient resources to sample 400 people total, we would draw 100 of them from the low income group.
If we are sampling 1,000 people, we'd draw 50 of them from the high income group.
Cluster Sampling
1.57
A cluster sample is a simple random sample of groups

or clusters of elements (vs. a simple random sample of
individual objects).
This method is useful when it is difficult or costly to

develop a complete list of the population members or
when the population elements are widely dispersed
geographically.
Cluster sampling may increase sampling error due to

similarities among cluster members.
Sampling Error
1.58
Sampling error refers to differences between the sample and

the population that exist only because of the observations
that happened to be selected for the sample.
Another way to look at this is: the differences in results for different samples (of the same size) are due to sampling error.
E.g. Two samples of size 10 of 1,000 households. If we happened to get the highest income level data points in our first sample and all the lowest income levels in the second, this difference is due to sampling error.
Nonsampling Error
1.59
Nonsampling errors are more serious and are due to

mistakes made in the acquisition of data or due to the
sample observations being selected improperly. Three
types of nonsampling errors:
Errors in data acquisition,

Nonresponse errors, and
Selection bias.
Note: increasing the sample size will not reduce this

type of error.
Approaches to Assigning Probabilities
There are three ways to assign a probability, $P(O_i)$, to an outcome, $O_i$, namely:
Classical approach: make certain assumptions (such as equally likely, independence) about the situation.
Relative frequency: assign probabilities based on experimentation or historical data.
Subjective approach: assign probabilities based on the assignor's judgment.
Interpreting Probability
1.61
One way to interpret probability is this:
If a random experiment is repeated an infinite number

of times, the relative frequency for any given outcome
is the probability of this outcome.
For example, the probability of heads in a flip of a balanced coin is .5, determined using the classical approach. The probability is interpreted as being the long-term relative frequency of heads if the coin is flipped an infinite number of times.
Conditional Probability
1.62
Conditional probability is used to determine how two

events are related; that is, we can determine the
probability of one event given the occurrence of
another related event.
Conditional probabilities are written as P(A | B), read as "the probability of A given B", and calculated as:
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
Independence
1.63
One of the objectives of calculating conditional

probability is to determine whether two events are
related.
In particular, we would like to know whether they are

independent, that is, if the probability of one event is
not affected by the occurrence of the other event.
Two events A and B are said to be independent if

P(A|B) = P(A)
or
P(B|A) = P(B)
Complement Rule
1.64
The complement of an event A is the event that occurs when

A does not occur.
The complement rule gives us the probability of an event

NOT occurring. That is:
$P(A^C) = 1 - P(A)$
For example, in the simple roll of a die, the probability of the number 1 being rolled is 1/6. The probability that some number other than 1 will be rolled is 1 − 1/6 = 5/6.
Multiplication Rule
1.65
The multiplication rule is used to calculate the joint probability of two events. It is based on the formula for conditional probability defined earlier:
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
If we multiply both sides of the equation by P(B) we have:
P(A and B) = P(A | B) P(B)
Likewise, P(A and B) = P(B | A) P(A)
If A and B are independent events, then P(A and B) = P(A) P(B)
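To make conditional probability, independence, and the multiplication rule concrete, here is a sketch on a single fair die. The events chosen (even, greater than 3, at most 2) are our own illustrations, not examples from the slides.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # one roll of a fair die, all outcomes equally likely

def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

def conditional(a, b):
    """P(A | B) = P(A and B) / P(B)."""
    return prob(a & b) / prob(b)

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}
at_most_2 = {1, 2}

# Dependent pair: knowing the roll is > 3 changes the chance that it is even
print(conditional(even, greater_than_3), prob(even))

# Independent pair: P(even | at most 2) equals P(even)
print(conditional(even, at_most_2), prob(even))

# Multiplication rule for the independent pair
print(prob(even & at_most_2), prob(even) * prob(at_most_2))
```

Using `Fraction` keeps the comparisons exact, so the equality checks for independence are not disturbed by floating-point rounding.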

Addition Rule
1.66
Recall: the addition rule was introduced earlier to

provide a way to compute the probability of event
A or B or both A and B occurring; i.e. the union of A
and B.

P(A or B) = P(A) + P(B) − P(A and B)
Why do we subtract the joint probability P(A and B) from the sum of the probabilities of A and B?
Addition Rule for Mutually Exclusive Events
If A and B are mutually exclusive, the occurrence of one event makes the other one impossible. This means that

P(A and B) = 0
The addition rule for mutually exclusive events is

P(A or B) = P(A) + P(B)
We often use this form when we add some joint probabilities

calculated from a probability tree
Two Types of Random Variables
1.68
Discrete Random Variable

one that takes on a countable number of values
E.g. the total on a roll of two dice: 2, 3, 4, …, 12
Continuous Random Variable

one whose values are not discrete, not countable
E.g. time (30.1 minutes? 30.10000001 minutes?)
Analogy:
Integers are Discrete, while Real Numbers are Continuous
Laws of Expected Value
1.69
1. E(c) = c
The expected value of a constant (c) is just the value of
the constant.
2. E(X + c) = E(X) + c
3. E(cX) = cE(X)
We can pull a constant out of the expected value
expression (either as part of a sum with a random
variable X or as a coefficient of random variable X).
Laws of Variance
1.70
1. V(c) = 0
The variance of a constant (c) is zero.
2. V(X + c) = V(X)
The variance of the sum of a random variable and a constant is just the variance of the random variable (per 1 above).
3. V(cX) = c²V(X)
The variance of a random variable multiplied by a constant coefficient is the coefficient squared times the variance of the random variable.
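The laws of expected value and variance can be verified numerically. This sketch uses a fair die as the random variable X and c = 3 as the constant; both are assumptions made for illustration.

```python
from fractions import Fraction

# Probability distribution of a fair die (assumed example)
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(g=lambda x: x):
    """V[g(X)] = sum of (g(x) - E[g(X)])^2 * P(x)."""
    mean = expected_value(g)
    return sum((g(x) - mean) ** 2 * p for x, p in pmf.items())

c = 3
print(expected_value(lambda x: x + c), expected_value() + c)   # E(X + c) = E(X) + c
print(expected_value(lambda x: c * x), c * expected_value())   # E(cX) = cE(X)
print(variance(lambda x: x + c), variance())                   # V(X + c) = V(X)
print(variance(lambda x: c * x), c**2 * variance())            # V(cX) = c^2 V(X)
```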
Binomial Distribution
1.71
The binomial distribution is the probability distribution

that results from doing a binomial experiment.
Binomial experiments have the following properties:
1. Fixed number of trials, represented as n.

2. Each trial has two possible outcomes, a success and
a failure.
3. P(success) = p (and thus P(failure) = 1 − p), for all trials.
4. The trials are independent, which means that the
outcome of one trial does not affect the outcomes of
any other trials.
Binomial Random Variable
1.72
The binomial random variable counts the number of successes in n trials of the binomial experiment. It can take on values 0, 1, 2, …, n. Thus, it's a discrete random variable.
To calculate the probability associated with each value we use combinatorics:
$P(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$ for x = 0, 1, 2, …, n
Binomial Table
1.73
What is the probability that Pat fails the quiz?
i.e. what is P(X ≤ 4), given P(success) = .20 and n = 10?
P(X ≤ 4) = .967
Binomial Table
1.74
What is the probability that Pat gets two answers

correct?
i.e. what is P(X = 2), given P(success) = .20 and
n=10 ?
P(X = 2) = P(X ≤ 2) − P(X ≤ 1) = .678 − .376 = .302
Remember, the table shows cumulative probabilities.
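The two table look-ups for Pat's quiz (n = 10, P(success) = .20) can be reproduced from the binomial formula directly. This is a sketch using only the standard library; the function names are ours.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x): the cumulative probability that a binomial table lists."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.20
p_exactly_two = binomial_pmf(2, n, p)    # probability of exactly two correct answers
p_at_most_four = binomial_cdf(4, n, p)   # probability Pat fails (four or fewer correct)

print(round(p_exactly_two, 4), round(p_at_most_four, 4))
```

The difference `binomial_cdf(2, n, p) - binomial_cdf(1, n, p)` gives the same value as `binomial_pmf(2, n, p)`, which mirrors the subtraction done with the cumulative table.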
=BINOMDIST() Excel Function
1.75
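There is a binomial distribution function in Excel that can also be used to calculate these probabilities.
For example: What is the probability that Pat gets two answers correct?
=BINOMDIST(2, 10, .20, FALSE)
The arguments are, in order: # successes, # trials, P(success), and cumulative (i.e. P(X ≤ x)?). With cumulative set to FALSE, the function returns P(X = 2) = .3020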
=BINOMDIST() Excel Function
1.76
For example: What is the probability that Pat fails the quiz?
=BINOMDIST(4, 10, .20, TRUE)
The arguments are, in order: # successes, # trials, P(success), and cumulative (i.e. P(X ≤ x)?). With cumulative set to TRUE, the function returns P(X ≤ 4) = .9672
Binomial Distribution
1.77
As you might expect, statisticians have developed

general formulas for the mean, variance, and
standard deviation of a binomial random variable.
They are:
$\mu = np$, $\sigma^2 = np(1-p)$, $\sigma = \sqrt{np(1-p)}$
Poisson Distribution
1.78
Named for Simeon Poisson, the Poisson distribution is a

discrete probability distribution and refers to the
number of events (a.k.a. successes) within a specific time
period or region of space. For example:
The number of cars arriving at a service station in 1 hour.
(The interval of time is 1 hour.)
The number of flaws in a bolt of cloth. (The specific region
is a bolt of cloth.)
The number of accidents in 1 day on a particular stretch of
highway. (The interval is defined by both time, 1 day, and
space, the particular stretch of highway.)
The Poisson Experiment
1.79
Like a binomial experiment, a Poisson experiment has

four defining characteristic properties:
1. The number of successes that occur in any interval is
independent of the number of successes that occur in
any other interval.
2. The probability of a success in an interval is the same
for all equal-size intervals
3. The probability of a success is proportional to the size
of the interval.
4. The probability of more than one success in an interval
approaches 0 as the interval becomes smaller.
The Poisson random variable is the number of successes that occur in a period of time or an interval of space in a Poisson experiment.
E.g. On average, 96 trucks (successes) arrive at a border crossing every hour (time period).
E.g. The number of typographic errors (successes?!) in a new textbook edition averages 1.5 per 100 pages (interval).
Poisson Probability Distribution
1.81
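The probability that a Poisson random variable assumes a value of x is given by:
$P(x) = \frac{e^{-\mu}\mu^x}{x!}$ for x = 0, 1, 2, ...
where $\mu$ is the mean number of successes in the interval and e is the natural logarithm base.
FYI: e ≈ 2.71828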
Example 7.12
1.82
The number of typographical errors in new editions of textbooks varies considerably from book to book. After some analysis, an instructor concludes that the number of errors is Poisson distributed with a mean of 1.5 per 100 pages. The instructor randomly selects 100 pages of a new book. What is the probability that there are no typos?
That is, what is P(X=0) given that $\mu$ = 1.5?
$P(0) = \frac{e^{-1.5}(1.5)^0}{0!} = e^{-1.5} = .2231$
There is about a 22% chance of finding zero errors.
1.83
As mentioned on the Poisson experiment slide:
The probability of a success is proportional to the size of

the interval
Thus, knowing an error rate of 1.5 typos per 100

pages, we can determine a mean value for a 400 page
book as:
$\mu$ = 1.5(4) = 6 typos / 400 pages.

Example 7.13
1.84
For a 400 page book, what is the probability that there are no typos?
$P(X=0) = \frac{e^{-6}(6)^0}{0!} = e^{-6} = .002479$
There is a very small chance there are no typos.
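Both typo calculations follow from the Poisson formula with x = 0. A minimal sketch, assuming the mean of 1.5 typos per 100 pages and scaling it by 4 for a 400 page book:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

# Example 7.12: 100 pages, mean of 1.5 typos
p_no_typos_100_pages = poisson_pmf(0, 1.5)

# Example 7.13: 400 pages, so the mean scales to 1.5 * 4 = 6 typos
p_no_typos_400_pages = poisson_pmf(0, 1.5 * 4)

print(round(p_no_typos_100_pages, 4), round(p_no_typos_400_pages, 6))
```

Scaling the mean with the interval size relies on the Poisson property that the probability of a success is proportional to the size of the interval.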
Example 7.13
1.85
Excel is an even better alternative:

Probability Density Functions
1.86
Unlike a discrete random variable which we studied

in Chapter 7, a continuous random variable is one
that can assume an uncountable number of values.
We cannot list the possible values because there
is an infinite number of them.
Because there is an infinite number of values, the
probability of each individual value is virtually 0.
Point Probabilities are Zero
1.87
Because there is an infinite number of values, the

probability of each individual value is virtually 0.
Thus, we can determine the probability of a range of

values only.
E.g. with a discrete random variable like tossing a die, it is

meaningful to talk about P(X=5), say.
In a continuous setting (e.g. with time as a random variable), the
probability the random variable of interest, say task length, takes
exactly 5 minutes is infinitesimally small, hence P(X=5) = 0.
It is meaningful to talk about P(X ≤ 5).
Probability Density Function
1.88
A function f(x) is called a probability density function (over the range a ≤ x ≤ b) if it meets the following requirements:
1) f(x) ≥ 0 for all x between a and b, and
2) The total area under the curve between a and b is 1.0

The Normal Distribution
1.89
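The normal distribution is the most important of all probability distributions. The probability density function of a normal random variable is given by:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$ for $-\infty < x < \infty$
Its graph is bell shaped and symmetrical around the mean.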
The Normal Distribution
1.90
Important things to note:

The normal distribution is fully defined by two parameters: its standard deviation and mean.
The normal distribution is bell shaped and symmetrical about the mean.
Unlike the range of the uniform distribution (a ≤ x ≤ b), normal distributions range from minus infinity to plus infinity.
Standard Normal Distribution
1.91
A normal distribution whose mean is zero and standard deviation is one is called the standard normal distribution.
As we shall see shortly, any normal distribution can be

converted to a standard normal distribution with simple
algebra. This makes calculations much easier.
Calculating Normal Probabilities
1.92
We can use the following function to convert any normal random variable to a standard normal random variable:
$Z = \frac{X - \mu}{\sigma}$
Some advice: always draw a picture!
1.93
Example: The time required to build a computer is

normally distributed with a mean of 50 minutes and a
standard deviation of 10 minutes:
What is the probability that a computer is assembled in

a time between 45 and 60 minutes?
Algebraically speaking, what is P(45 < X < 60) ?

Standardizing with a mean of 50 minutes and a standard deviation of 10 minutes:
$P(45 < X < 60) = P\left(\frac{45-50}{10} < Z < \frac{60-50}{10}\right) = P(-.5 < Z < 1)$
We can use Table 3 in Appendix B to look up probabilities P(0 < Z < z).
We can break up P(−.5 < Z < 1) into:
P(−.5 < Z < 0) + P(0 < Z < 1)
The distribution is symmetric around zero, so we have:
P(−.5 < Z < 0) = P(0 < Z < .5)
Hence: P(−.5 < Z < 1) = P(0 < Z < .5) + P(0 < Z < 1)
1.96
How to use Table 3

This table gives probabilities P(0 < Z < z)
First column = integer + first decimal
Top row = second decimal place
P(0 < Z < 0.5) = .1915
P(0 < Z < 1) = .3413
P(−.5 < Z < 1) = .1915 + .3413 = .5328
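The computer-assembly probability can be checked without a table using `statistics.NormalDist` from the Python standard library. This sketch computes it both on the original scale and after standardizing; the variable names are ours.

```python
from statistics import NormalDist

# Assembly time: normal with mean 50 minutes and standard deviation 10 minutes
build_time = NormalDist(mu=50, sigma=10)
p_between = build_time.cdf(60) - build_time.cdf(45)

# Same probability after converting to the standard normal: Z = (X - mu) / sigma
z_low = (45 - 50) / 10
z_high = (60 - 50) / 10
standard_normal = NormalDist()
p_standardized = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(round(p_between, 4), round(p_standardized, 4))
```

Note that `cdf` returns P(Z < z), whereas the table described above lists P(0 < Z < z); subtracting .5 from a `cdf` value for positive z converts between the two.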

Using the Normal Table (Table 3)
1.97
What is P(Z > 1.6)?
From the table, P(0 < Z < 1.6) = .4452
P(Z > 1.6) = .5 − P(0 < Z < 1.6)
= .5 − .4452
= .0548
1.98
What is P(Z < −2.23)?
By symmetry, P(Z < −2.23) = P(Z > 2.23)
= .5 − P(0 < Z < 2.23)
= .5 − .4871
= .0129
1.99
What is P(Z < 1.52)?
P(Z < 1.52) = P(Z < 0) + P(0 < Z < 1.52)
= .5 + P(0 < Z < 1.52)
= .5 + .4357
= .9357
1.100
What is P(0.9 < Z < 1.9)?
P(0.9 < Z < 1.9) = P(0 < Z < 1.9) − P(0 < Z < 0.9)
= .4713 − .3159
= .1554
Finding Values of Z
1.101
Other Z values are

Z.05 = 1.645
Z.01 = 2.33
Using the values of Z
1.102
Because z.025 = 1.96 and - z.025= -1.96, it follows

that we can state
P(-1.96 < Z < 1.96) = .95
Similarly
P(-1.645 < Z < 1.645) = .90
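The critical values quoted above can be recovered with the inverse CDF of the standard normal. A sketch, where `z_critical` is our own helper that returns the value with area A to its right:

```python
from statistics import NormalDist

standard_normal = NormalDist()

def z_critical(a):
    """Return z_A, the value with area A under the standard normal curve to its right."""
    return standard_normal.inv_cdf(1 - a)

for a in (0.05, 0.025, 0.01):
    print(f"z_{a} = {z_critical(a):.3f}")

# Central probability between -z_.025 and +z_.025
z = z_critical(0.025)
central_area = standard_normal.cdf(z) - standard_normal.cdf(-z)
print(round(central_area, 2))
```

The computed values are slightly more precise than the table entries (for instance, the .01 value is closer to 2.326 than 2.33), which is expected from rounding in printed tables.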
Other Continuous Distributions
1.103
Three other important continuous distributions which

will be used extensively in later sections are
introduced here:
Student t Distribution,
Chi-Squared Distribution, and
F Distribution.
Student t Distribution
1.104
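Here the letter t is used to represent the random variable, hence the name. The density function for the Student t distribution is as follows:
$f(t) = \frac{\Gamma[(\nu+1)/2]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} \left[1 + \frac{t^2}{\nu}\right]^{-(\nu+1)/2}$
$\nu$ (nu) is called the degrees of freedom, and $\Gamma$ (Gamma function) is $\Gamma(k) = (k-1)(k-2)\cdots(2)(1)$ for a positive integer k.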
Student t Distribution
1.105
In much the same way that $\mu$ and $\sigma$ define the normal distribution, $\nu$, the degrees of freedom, defines the Student t distribution (Figure 8.24).
As the number of degrees of freedom increases, the t distribution approaches the standard normal distribution.
Determining Student t Values
1.106
The Student t distribution is used extensively in statistical inference. Table 4 in Appendix B lists values of $t_{A,\nu}$.
That is, values of a Student t random variable with $\nu$ degrees of freedom such that:
$P(t > t_{A,\nu}) = A$
The values for A are pre-determined critical values, typically in the 10%, 5%, 2.5%, 1% and 1/2% range.
Using the t table (Table 4) for values
1.107
For example, if we want the value of t with 10 degrees of freedom such that the area under the Student t curve is .05:
Area under the curve value ($t_A$): COLUMN
Degrees of freedom: ROW
$t_{.05,10} = 1.812$

F Distribution
1.108
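The F density function is given by:
$f(F) = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} \frac{F^{(\nu_1-2)/2}}{\left(1 + \frac{\nu_1 F}{\nu_2}\right)^{(\nu_1+\nu_2)/2}}$ for F > 0
Two parameters define this distribution, and like we've already seen, these are again degrees of freedom.
$\nu_1$ is the numerator degrees of freedom and
$\nu_2$ is the denominator degrees of freedom.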
Determining Values of F
1.109
For example, what is the value of F for 5% of the area under the right hand tail of the curve, with a numerator degree of freedom of 3 and a denominator degree of freedom of 7?
Solution: use the F look-up (Table 6).
There are different tables for different values of A. Make sure you start with the correct table!!
Numerator degrees of freedom: COLUMN
Denominator degrees of freedom: ROW
$F_{.05,3,7} = 4.35$
Determining Values of F
1.110
For areas under the curve on the left hand side of the curve, we can leverage the following relationship:
$F_{1-A,\nu_1,\nu_2} = \frac{1}{F_{A,\nu_2,\nu_1}}$
Pay close attention to the order of the terms!

CHAPTER 9
Sampling Distributions
1.111
Sampling Distribution of the Mean
1.112
A fair die is thrown infinitely many times, with the random variable X = # of spots on any throw.
The probability distribution of X is:
x      1    2    3    4    5    6
P(x)   1/6  1/6  1/6  1/6  1/6  1/6
and the mean and variance are calculated as well:
$\mu = \sum x P(x) = 3.5$ and $\sigma^2 = \sum (x-\mu)^2 P(x) = 2.92$

Sampling Distribution of Two Dice
1.113
A sampling distribution is created by looking at

all samples of size n=2 (i.e. two dice) and their means
While there are 36 possible samples of size 2, there are only 11 values for $\bar{x}$, and some (e.g. $\bar{x}$ = 3.5) occur more frequently than others (e.g. $\bar{x}$ = 1).
Sampling Distribution of Two Dice
1.114
The sampling distribution of $\bar{X}$ is shown below:
$\bar{x}$   P($\bar{x}$)
1.0   1/36
1.5   2/36
2.0   3/36
2.5   4/36
3.0   5/36
3.5   6/36
4.0   5/36
4.5   4/36
5.0   3/36
5.5   2/36
6.0   1/36
[Figure: bar chart of P($\bar{x}$) against $\bar{x}$ from 1.0 to 6.0]
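The sampling distribution of the mean of two dice can be built by enumerating all 36 equally likely samples. This is a sketch with exact fractions; the variable names are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# All 36 ordered samples of size n = 2
samples = list(product(faces, repeat=2))

# Count how many samples produce each possible sample mean
counts = Counter(Fraction(a + b, 2) for a, b in samples)

# Probability of each sample mean
sampling_distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(counts.items())
}

for mean, count in sorted(counts.items()):
    print(f"{float(mean):.1f}  {count}/36")

# Mean and variance of the sample mean
mean_of_means = sum(m * p for m, p in sampling_distribution.items())
variance_of_means = sum((m - mean_of_means) ** 2 * p for m, p in sampling_distribution.items())
print(mean_of_means, variance_of_means)
```

The mean of the sample means equals the single-die mean of 3.5, while the variance comes out at half the single-die variance of 35/12.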
Compare
1.115
Compare the distribution of X (values 1, 2, ..., 6) with the sampling distribution of $\bar{X}$ (values 1.0, 1.5, ..., 6.0).
As well, note that:
$\mu_{\bar{x}} = \mu$ and $\sigma^2_{\bar{x}} = \sigma^2 / 2$

Central Limit Theorem
1.116
The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size.
The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.
Central Limit Theorem
1.117
If the population is normal, then $\bar{X}$ is normally distributed for all values of n.
If the population is non-normal, then $\bar{X}$ is approximately normal only for larger values of n.
In many practical situations, a sample size of 30 may be sufficiently large to allow us to use the normal distribution as an approximation for the sampling distribution of $\bar{X}$.
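A small simulation can make the theorem concrete. This is only a sketch under stated assumptions: the population is a fair die (clearly non-normal), the sample size is n = 30, and the number of repetitions and the seed are arbitrary choices, not values from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

n = 30          # sample size (the rule-of-thumb value mentioned above)
reps = 20_000   # number of simulated samples (arbitrary choice)

# Population: a fair die, with mean 3.5 and variance 35/12 (about 2.92).
sample_means = [
    statistics.fmean(random.randint(1, 6) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
theoretical_sd = (35 / 12 / n) ** 0.5  # sigma / sqrt(n)

# Share of sample means within one standard error of 3.5;
# for a normal distribution this share is about 68%.
within_one_se = sum(abs(m - 3.5) <= theoretical_sd for m in sample_means) / reps

print("mean of sample means:", round(mean_of_means, 3))
print("std. dev. of sample means:", round(sd_of_means, 3))
print("sigma / sqrt(n):", round(theoretical_sd, 3))
print("share within one standard error:", round(within_one_se, 3))
```

The simulated mean and standard deviation of the sample means should land close to 3.5 and sigma / sqrt(n), and the share within one standard error should be near the normal-curve value.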
Sampling Distribution of the Sample Mean
1.118
1. $\mu_{\bar{x}} = \mu$
2. $\sigma^2_{\bar{x}} = \sigma^2/n$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$
3. If X is normal, $\bar{X}$ is normal. If X is nonnormal, $\bar{X}$ is approximately normal for sufficiently large sample sizes.
Note: the definition of "sufficiently large" depends on the extent of nonnormality of X (e.g. heavily skewed; multimodal).
Example 9.1(a)
1.119
The foreman of a bottling plant has observed that

the amount of soda in each 32-ounce bottle is
actually a normally distributed random variable,
with a mean of 32.2 ounces and a standard
deviation of .3 ounce.
If a customer buys one bottle, what is the

probability that the bottle will contain more than
32 ounces?
Example 9.1(a)
1.120
We want to find P(X > 32), where X is normally distributed with $\mu$ = 32.2 and $\sigma$ = .3:
$P(X > 32) = P\left(Z > \frac{32 - 32.2}{.3}\right) = P(Z > -.67) = .7486$
There is about a 75% chance that a single bottle of soda contains more than 32 oz.
Example 9.1(b)
1.121
The foreman of a bottling plant has observed that

the amount of soda in each 32-ounce bottle is
actually a normally distributed random variable,
with a mean of 32.2 ounces and a standard
deviation of .3 ounce.
If a customer buys a carton of four bottles, what is

the probability that the mean amount of the four
bottles will be greater than 32 ounces?
Example 9.1(b)
1.122
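We want to find P($\bar{X}$ > 32), where X is normally distributed with $\mu$ = 32.2 and $\sigma$ = .3.
Things we know:
1) X is normally distributed, therefore so is $\bar{X}$.
2) $\mu_{\bar{x}} = \mu$ = 32.2 oz.
3) $\sigma_{\bar{x}} = \sigma/\sqrt{n} = .3/\sqrt{4} = .15$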
Example 9.1(b)
1.123
If a customer buys a carton of four bottles, what is

the probability that the mean amount of the four
bottles will be greater than 32 ounces?
$P(\bar{X} > 32) = P\left(Z > \frac{32 - 32.2}{.15}\right) = P(Z > -1.33) = .9082$
There is about a 91% chance the mean of the four bottles will exceed 32 oz.
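Both parts of Example 9.1 can be reproduced with the standard library's `statistics.NormalDist`. This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3   # population mean and standard deviation (ounces)

# (a) One bottle: X is normal with mean 32.2 and standard deviation 0.3.
p_single = 1 - NormalDist(mu, sigma).cdf(32)

# (b) Mean of n = 4 bottles: X-bar is normal with mean 32.2
#     and standard deviation sigma / sqrt(n) = 0.15.
n = 4
p_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print("P(X > 32)     =", round(p_single, 4))
print("P(X-bar > 32) =", round(p_mean, 4))
```

The results differ slightly in the third decimal from table look-ups, because a printed table forces z to be rounded to two decimals first.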
Graphically Speaking
1.124
[Figure: two normal curves centered at mean = 32.2. Left panel: what is the probability that one bottle will contain more than 32 ounces? Right panel: what is the probability that the mean of four bottles will exceed 32 oz?]
Sampling Distribution: Difference of two means
1.125
The final sampling distribution introduced is that of the difference between two sample means. This requires:
independent random samples be drawn from each of two normal populations.
If this condition is met, then the sampling distribution of the difference between the two sample means, i.e. $\bar{X}_1 - \bar{X}_2$, will be normally distributed.
(Note: if the two populations are not both normally distributed, but the sample sizes are large (>30), the distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal.)
Sampling Distribution: Difference of two means
1.126
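The expected value and standard deviation of the sampling distribution of $\bar{X}_1 - \bar{X}_2$ are given by:
mean: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
standard deviation: $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
(also called the standard error of the difference between two means)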
Estimation
1.127
There are two types of inference: estimation and

hypothesis testing; estimation is introduced first.
The objective of estimation is to determine the

approximate value of a population parameter on
the basis of a sample statistic.
E.g., the sample mean ($\bar{x}$) is employed to estimate the population mean ($\mu$).
Estimation
1.128
The objective of estimation is to determine the

approximate value of a population parameter on
the basis of a sample statistic.
There are two types of estimators:
Point Estimator
Interval Estimator
Point & Interval Estimation
1.129
For example, suppose we want to estimate the mean summer income of a class of business students. For n=25 students, $\bar{x}$ is calculated to be 400 $/week. This is a point estimate.
An alternative statement is:
The mean income is between 380 and 420 $/week. This is an interval estimate.
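Estimating $\mu$ when $\sigma$ is known
1.130
We established in Chapter 9:
$P\left(-z_{\alpha/2} < \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1-\alpha$
Thus, the probability that the interval
$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$
contains the population mean $\mu$ is $1-\alpha$. This is a confidence interval estimator for $\mu$. The sample mean is in the center of the confidence interval.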
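Four commonly used confidence levels
1.131
Table 10.1 (cut & keep handy!)
Confidence Level $1-\alpha$   $\alpha$   $\alpha/2$   $z_{\alpha/2}$
.90    .10    .05     $z_{.05}$ = 1.645
.95    .05    .025    $z_{.025}$ = 1.96
.98    .02    .01     $z_{.01}$ = 2.33
.99    .01    .005    $z_{.005}$ = 2.575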
Example 10.1
1.132
A computer company samples demand during lead time over 25 time periods:
235 374 309 499 253
421 361 514 462 369
394 439 348 344 330
261 374 302 466 535
386 316 296 332 334
It is known that the standard deviation of demand over lead time is 75 computers. We want to estimate the mean demand over lead time with 95% confidence in order to set inventory levels.
CALCULATE
Example 10.1
1.133
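In order to use our confidence interval estimator, we need the following pieces of data:
$\bar{x}$ = 370.16 (calculated from the data)
$z_{\alpha/2}$ = 1.96 (for 95% confidence)
$\sigma$ = 75 (given)
n = 25 (given)
therefore:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 370.16 \pm 1.96\,\frac{75}{\sqrt{25}} = 370.16 \pm 29.40$
The lower and upper confidence limits are 340.76 and 399.56.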
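The calculation for Example 10.1 can be sketched in a few lines of Python. The data and the known standard deviation come from the example; the variable names are illustrative, and `NormalDist().inv_cdf` is used in place of the z table.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Demand during lead time over 25 periods (Example 10.1).
demand = [
    235, 374, 309, 499, 253,
    421, 361, 514, 462, 369,
    394, 439, 348, 344, 330,
    261, 374, 302, 466, 535,
    386, 316, 296, 332, 334,
]
sigma = 75          # known population standard deviation
confidence = 0.95

n = len(demand)
x_bar = fmean(demand)

# z_{alpha/2}: the point with area alpha/2 in the upper tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * sigma / sqrt(n)
lcl, ucl = x_bar - margin, x_bar + margin

print("x-bar =", round(x_bar, 2))
print("LCL =", round(lcl, 2), "UCL =", round(ucl, 2))
```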
INTERPRET
Example 10.1
1.134
The estimate for the mean demand during lead time lies between 340.76 and 399.56. We can use this as input in developing an inventory policy.
That is, we estimated that the mean demand during lead time falls between 340.76 and 399.56, and this type of estimator is correct 95% of the time. That also means that 5% of the time the estimator will be incorrect.
Incidentally, the media often refer to the 95% figure as "19 times out of 20," which emphasizes the long-run aspect of the confidence level.
Interval Width
1.135
A wide interval provides little information.
For example, suppose we estimate with 95% confidence that an accountant's average starting salary is between $15,000 and $100,000.
Contrast this with a 95% confidence interval estimate of starting salaries between $42,000 and $45,000.
The second estimate is much narrower, providing accounting students more precise information about starting salaries.
Interval Width
1.136
The width of the confidence interval estimate is a

function of the confidence level, the population
standard deviation, and the sample size
Selecting the Sample Size
1.137
We can control the width of the interval by determining the sample size necessary to produce narrow intervals.
Suppose we want to estimate the mean demand to within 5 units; i.e. we want the interval estimate to be:
$\bar{x} \pm 5$
Since the interval estimator is:
$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
It follows that
$z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 5$
Solve for n to get the requisite sample size!
Selecting the Sample Size
1.138
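Solving the equation
$n = \left(\frac{z_{\alpha/2}\,\sigma}{5}\right)^2 = \left(\frac{(1.96)(75)}{5}\right)^2 = 864.36$, which rounds up to 865;
that is, to produce a 95% confidence interval estimate of the mean ($\pm$5 units), we need to sample 865 lead time periods (vs. the 25 data points we have currently).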
Sample Size to Estimate a Mean
1.139
The general formula for the sample size needed to estimate a population mean with an interval estimate of:
$\bar{x} \pm W$
Requires a sample size of at least this large:
$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2$
Example 10.2
1.140
A lumber company must estimate the mean

diameter of trees to determine whether or not there
is sufficient lumber to harvest an area of forest.
They need to estimate this to within 1 inch at a
confidence level of 99%. The tree diameters are
normally distributed with a standard deviation of 6
inches.
How many trees need to be sampled?

Example 10.2
1.141
Things we know:
Confidence level = 99%, therefore $\alpha$ = .01.
We want $\bar{x} \pm 1$, hence W = 1.
We are given that $\sigma$ = 6.
Example 10.2
1.142
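We compute
$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\frac{(2.575)(6)}{1}\right)^2 = 238.7$, which rounds up to 239.
That is, we will need to sample at least 239 trees to have a 99% confidence interval of $\bar{x} \pm 1$.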
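The sample-size formula is easy to wrap in a helper. The function below is a sketch; its name and signature are hypothetical, and it always rounds up because a fractional observation cannot be sampled.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, width, confidence):
    """Smallest n so that the interval x-bar +/- width has the given confidence.

    sigma: known population standard deviation
    width: half-width W of the desired interval
    confidence: confidence level, e.g. 0.95
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    return ceil((z * sigma / width) ** 2)


# Mean demand to within 5 units at 95% confidence (sigma = 75).
n_demand = sample_size_for_mean(sigma=75, width=5, confidence=0.95)

# Example 10.2: tree diameters to within 1 inch at 99% confidence (sigma = 6).
n_trees = sample_size_for_mean(sigma=6, width=1, confidence=0.99)

print(n_demand, n_trees)
```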
Nonstatistical Hypothesis Testing
1.143
A criminal trial is an example of hypothesis testing without

the statistics.
In a trial a jury must decide between two hypotheses. The
null hypothesis is
H0: The defendant is innocent
The alternative hypothesis or research hypothesis is

H1: The defendant is guilty
The jury does not know which hypothesis is true. They must
make a decision on the basis of evidence presented.
1.144
There are two possible errors.
A Type I error occurs when we reject a true null hypothesis. That is, a Type I error occurs when the jury convicts an innocent person.
A Type II error occurs when we don't reject a false null hypothesis. That occurs when a guilty defendant is acquitted.
1.145
The probability of a Type I error is denoted as $\alpha$ (Greek letter alpha). The probability of a Type II error is $\beta$ (Greek letter beta).
The two probabilities are inversely related. Decreasing one increases the other.
1.146
The critical concepts are these:

1. There are two hypotheses, the null and the alternative
hypotheses.
2. The procedure begins with the assumption that the null
hypothesis is true.
3. The goal is to determine whether there is enough
evidence to infer that the alternative hypothesis is true.
4. There are two possible decisions:
Conclude that there is enough evidence to support the
alternative hypothesis.
Conclude that there is not enough evidence to support
the alternative hypothesis.
1.147
5. Two possible errors can be made.
Type I error: Reject a true null hypothesis.
Type II error: Do not reject a false null hypothesis.
P(Type I error) = $\alpha$
P(Type II error) = $\beta$
Concepts of Hypothesis Testing (1)
1.148
There are two hypotheses. One is called the null hypothesis and the other the alternative or research hypothesis. The usual notation is:
H0: the null hypothesis (pronounced "H nought")
H1: the alternative or research hypothesis
The null hypothesis (H0) will always state that the parameter equals the value specified in the alternative hypothesis (H1).
Concepts of Hypothesis Testing
1.149
Consider Example 10.1 (mean demand for computers during assembly lead time) again. Rather than estimate the mean demand, our operations manager wants to know whether the mean is different from 350 units. We can rephrase this request into a test of the hypothesis:
H0: $\mu$ = 350
Thus, our research hypothesis becomes:
H1: $\mu \neq$ 350 (this is what we are interested in determining)
Concepts of Hypothesis Testing (4)
1.150
There are two possible decisions that can be made:
Conclude that there is enough evidence to support the

alternative hypothesis
(also stated as: rejecting the null hypothesis in favor of the
alternative)
Conclude that there is not enough evidence to support

the alternative hypothesis
(also stated as: not rejecting the null hypothesis in favor of
the alternative)
NOTE: we do not say that we accept the null hypothesis
Concepts of Hypothesis Testing
1.151
Once the null and alternative hypotheses are stated, the next step is to randomly sample the population and calculate a test statistic (in this example, the sample mean).
If the test statistic's value is inconsistent with the null hypothesis, we reject the null hypothesis and infer that the alternative hypothesis is true.
For example, if we're trying to decide whether the mean is not equal to 350, a large value of $\bar{x}$ (say, 600) would provide enough evidence. If $\bar{x}$ is close to 350 (say, 355), we could not say that this provides a great deal of evidence to infer that the population mean is different from 350.
Types of Errors
1.152
A Type I error occurs when we reject a true null hypothesis (i.e. reject H0 when it is TRUE).
A Type II error occurs when we don't reject a false null hypothesis (i.e. do NOT reject H0 when it is FALSE).
                    H0 is TRUE    H0 is FALSE
Reject H0           Type I        (correct)
Do not reject H0    (correct)     Type II
Recap I
1.153
1) Two hypotheses: H0 & H1

2) ASSUME H0 is TRUE
3) GOAL: determine if there is enough evidence to
infer that H1 is TRUE
4) Two possible decisions:
Reject H0 in favor of H1
NOT Reject H0 in favor of H1
5) Two possible types of errors:
Type I: reject a true H0 [P(Type I) = $\alpha$]
Type II: not reject a false H0 [P(Type II) = $\beta$]
Example 11.1
1.154
A department store manager determines that a new

billing system will be cost-effective only if the mean
monthly account is more than $170.
A random sample of 400 monthly accounts is drawn, for

which the sample mean is $178. The accounts are
approximately normally distributed with a standard
deviation of $65.
Can we conclude that the new system will be cost-effective?
Example 11.1
1.155
The system will be cost-effective if the mean account balance for all customers is greater than $170.
We express this belief as our research hypothesis, that is:
H1: $\mu$ > 170 (this is what we want to determine)
Thus, our null hypothesis becomes:
H0: $\mu$ = 170 (this specifies a single value for the parameter of interest)
Example 11.1
1.156
What we want to show:
H1: $\mu$ > 170
H0: $\mu$ = 170 (we'll assume this is true)
We know:
n = 400,
$\bar{x}$ = 178, and
$\sigma$ = 65
Hmm. What to do next?!

Example 11.1
1.157
To test our hypotheses, we can use two different

approaches:
The rejection region approach (typically used when

computing statistics manually), and
The p-value approach (which is generally used with a

computer and statistical software).
We will explore both in turn

Example 11.1 Rejection Region
1.158
The rejection region is a range of values such that if the test statistic falls into that range, we decide to reject the null hypothesis in favor of the alternative hypothesis.
Here the rejection region is $\bar{x} > \bar{x}_L$, where $\bar{x}_L$ is the critical value of $\bar{x}$ to reject H0.

Example 11.1
1.159
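All that's left to do is calculate $\bar{x}_L$ and compare it to the sample mean, 178.
We can calculate this based on any level of significance ($\alpha$) we want:
$P(\bar{x} > \bar{x}_L) = \alpha$, so $\frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}} = z_{\alpha}$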
Example 11.1
1.160
At a 5% significance level (i.e. $\alpha$ = 0.05), we get
$\frac{\bar{x}_L - 170}{65/\sqrt{400}} = z_{.05} = 1.645$
Solving, we compute $\bar{x}_L$ = 175.34.
Since our sample mean (178) is greater than the critical value we calculated (175.34), we reject the null hypothesis in favor of H1, i.e. that $\mu$ > 170, and conclude that it is cost-effective to install the new billing system.
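Example 11.1 The Big Picture
1.161
[Figure: sampling distribution of $\bar{x}$ under H0: $\mu$ = 170 (against H1: $\mu$ > 170), with the critical value $\bar{x}_L$ = 175.34 and the observed $\bar{x}$ = 178 inside the rejection region.]
Reject H0 in favor of H1.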
Standardized Test Statistic
1.162
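An easier method is to use the standardized test statistic:
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
and compare its result to $z_{\alpha}$ (rejection region: $z > z_{\alpha}$):
$z = \frac{178 - 170}{65/\sqrt{400}} = 2.46$
Since z = 2.46 > 1.645 ($z_{.05}$), we reject H0 in favor of H1.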
PLOT POWER CURVE
1.163
p-Value
1.164
The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.
In the case of our department store example, what is the probability of observing a sample mean at least as extreme as the one already observed (i.e. $\bar{x}$ = 178), given that the null hypothesis (H0: $\mu$ = 170) is true?
p-value = $P(\bar{x} > 178) = P\left(Z > \frac{178 - 170}{65/\sqrt{400}}\right) = P(Z > 2.46) = .0069$
Interpreting the p-value
1.165
The smaller the p-value, the more statistical evidence exists to support the alternative hypothesis.
If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.
If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.
If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.
If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.
We observe a p-value of .0069, hence there is overwhelming evidence to support H1: $\mu$ > 170.
Interpreting the p-value
1.166
Compare the p-value with the selected value of the significance level:
If the p-value is less than $\alpha$, we judge the p-value to be small enough to reject the null hypothesis.
If the p-value is greater than $\alpha$, we do not reject the null hypothesis.
Since p-value = .0069 < $\alpha$ = .05, we reject H0 in favor of H1.
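The rejection-region and p-value approaches for Example 11.1 can be put side by side in a short sketch. The numbers are those given in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 11.1: H0: mu = 170 against H1: mu > 170 (right-tail test).
mu0, x_bar, sigma, n, alpha = 170, 178, 65, 400, 0.05

# Standardized test statistic.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Rejection region approach: reject when z > z_alpha.
z_crit = NormalDist().inv_cdf(1 - alpha)

# p-value approach: area to the right of z under the standard normal curve.
p_value = 1 - NormalDist().cdf(z)

reject = p_value < alpha

print("z =", round(z, 2), "critical value =", round(z_crit, 3))
print("p-value =", round(p_value, 4), "reject H0:", reject)
```

Both approaches always agree for the same significance level: z exceeds the critical value exactly when the p-value falls below alpha.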
Chapter-Opening Example
1.167
The objective of the study is to draw a conclusion about the mean payment period. Thus, the parameter to be tested is the population mean. We want to know whether there is enough statistical evidence to show that the population mean is less than 22 days. Thus, the alternative hypothesis is
H1: $\mu$ < 22
The null hypothesis is
H0: $\mu$ = 22

1.168
The test statistic is
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
We wish to reject the null hypothesis in favor of the alternative only if the sample mean, and hence the value of the test statistic, is small enough. As a result, we locate the rejection region in the left tail of the sampling distribution.
We set the significance level at 10%.

1.169
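Rejection region: $z < -z_{\alpha} = -z_{.10} = -1.28$
From the data in SSA we compute
$\bar{x} = \frac{\sum x_i}{220} = \frac{4{,}759}{220} = 21.63$
and
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{21.63 - 22}{6/\sqrt{220}} = -.91$
p-value = P(Z < -.91) = .5 - .3186 = .1814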

1.170
Conclusion: There is not enough evidence to infer that the mean is less than 22.
There is not enough evidence to infer that the plan will be profitable.
Since z = -.91 > $-z_{.10}$ = -1.28, we fail to reject H0: $\mu$ = 22 at a 10% level of significance.
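The left-tail test above can be reproduced from the summary figures on the slides (sum of observations 4,759, n = 220, sigma = 6). This is a sketch with illustrative variable names.

```python
from math import sqrt
from statistics import NormalDist

# H0: mu = 22 against H1: mu < 22 (left-tail test), alpha = 10%.
total, n, mu0, sigma, alpha = 4759, 220, 22, 6, 0.10

x_bar = total / n
z = (x_bar - mu0) / (sigma / sqrt(n))

# Left-tail critical value: -z_alpha.
z_crit = NormalDist().inv_cdf(alpha)

# Left-tail p-value: area to the left of z.
p_value = NormalDist().cdf(z)

reject = z < z_crit

print("x-bar =", round(x_bar, 2), "z =", round(z, 2))
print("critical value =", round(z_crit, 2), "p-value =", round(p_value, 4))
print("reject H0:", reject)
```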
PLOT POWER CURVE
1.171

Right-Tail Testing
1.172
Calculate the critical value of the mean ($\bar{x}_L$) and compare against the observed value of the sample mean ($\bar{x}$).
Left-Tail Testing
1.173
Calculate the critical value of the mean ($\bar{x}_L$) and compare against the observed value of the sample mean ($\bar{x}$).
Two-Tail Testing
1.174
Two-tail testing is used when we want to test a research hypothesis that a parameter is not equal ($\neq$) to some value.
Example 11.2
1.175
AT&T argues that its rates are such that customers won't see a difference in their phone bills between them and their competitors. They calculate the mean and standard deviation for all their customers at $17.09 and $3.87 (respectively).
They then sample 100 customers at random and recalculate a monthly phone bill based on competitors' rates.
What we want to show is whether or not:
H1: $\mu \neq$ 17.09. We do this by assuming that:
H0: $\mu$ = 17.09
Example 11.2
1.176
The rejection region is set up so we can reject the null hypothesis when the test statistic is large or when it is small.
[Figure: sampling distribution with a rejection region in each tail, labeled "stat is small" on the left and "stat is large" on the right]
That is, we set up a two-tail rejection region. The total area in the rejection region must sum to $\alpha$, so we divide this probability by 2.
Example 11.2
1.177
At a 5% significance level (i.e. $\alpha$ = .05), we have $\alpha/2$ = .025. Thus, $z_{.025}$ = 1.96 and our rejection region is:
z < -1.96 or z > 1.96
[Figure: standard normal curve with rejection regions below $-z_{.025}$ and above $+z_{.025}$]
Example 11.2
1.178
From the data, we calculate $\bar{x}$ = 17.55.
Using our standardized test statistic:
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
We find that:
$z = \frac{17.55 - 17.09}{3.87/\sqrt{100}} = 1.19$
Since z = 1.19 is not greater than 1.96, nor less than -1.96, we cannot reject the null hypothesis in favor of H1. That is, there is insufficient evidence to infer that there is a difference between the bills of AT&T and the competitor.
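The two-tail test in Example 11.2 follows the same pattern, with alpha split across both tails. A minimal sketch with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

# Example 11.2: H0: mu = 17.09 against H1: mu != 17.09 (two-tail test).
mu0, sigma, n, x_bar, alpha = 17.09, 3.87, 100, 17.55, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))

# Each tail gets alpha / 2, so the critical value is z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Two-tail p-value: twice the area beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject = abs(z) > z_crit

print("z =", round(z, 2), "critical values = +/-", round(z_crit, 2))
print("p-value =", round(p_value, 4), "reject H0:", reject)
```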
PLOT POWER CURVE
1.179
Summary of One- and Two-Tail Tests
1.180
One-Tail Test (left tail)    Two-Tail Test    One-Tail Test (right tail)
Inference About A Population [SIGMA UNKNOWN]
1.181
[Figure: a sample is drawn from the population; the sample statistic is used for inference about the population parameter]
We will develop techniques to estimate and test three population parameters:
Population Mean $\mu$
Population Variance $\sigma^2$
Population Proportion p
Inference With Variance Unknown
1.182
Previously, we looked at estimating and testing the population mean when the population standard deviation ($\sigma$) was known or given:
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
But how often do we know the actual population variance?
Instead, we use the Student t-statistic, given by:
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
Testing $\mu$ when $\sigma$ is unknown
1.183
When the population standard deviation is unknown and the population is normal, the test statistic for testing hypotheses about $\mu$ is:
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
which is Student t distributed with $\nu = n - 1$ degrees of freedom. The confidence interval estimator of $\mu$ is given by:
$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$
Example 12.1
1.184
Will new workers achieve 90% of the level of

experienced workers within one week of being
hired and trained?
Experienced workers can process 500

packages/hour, thus if our conjecture is correct, we
expect new workers to be able to process .90(500)
= 450 packages per hour.
Given the data, is this the case?

IDENTIFY
Example 12.1
1.185
Our objective is to describe the population of the numbers of packages processed in 1 hour by new workers; that is, we want to know whether the new workers' productivity is more than 90% of that of experienced workers. Thus we have:
H1: $\mu$ > 450
Therefore we set our usual null hypothesis to:
H0: $\mu$ = 450
COMPUTE
Example 12.1
1.186
Our test statistic is:
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
With n=50 data points, we have n-1=49 degrees of freedom. Our hypothesis under question is:
H1: $\mu$ > 450
Our rejection region becomes:
$t > t_{\alpha,\nu} = t_{\alpha,49}$
Thus we will reject the null hypothesis in favor of the alternative if our calculated test statistic falls in this region.
COMPUTE
Example 12.1
1.187
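From the data, we calculate $\bar{x}$ = 460.38, s = 38.83
and thus:
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{460.38 - 450}{38.83/\sqrt{50}} = 1.89$
Since t = 1.89 falls in the rejection region (for example, at a 5% significance level the critical value $t_{.05,49}$ is about 1.68), we reject H0 in favor of H1; that is, there is sufficient evidence to conclude that the new workers are producing at more than 90% of the average of experienced workers.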
IDENTIFY
Example 12.2
1.188
Can we estimate the return on investment for companies that won quality awards?
We are given a random sample of n = 83 such companies. We want to construct a 95% confidence interval for the mean return, i.e. what is:
$\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}$ ?
COMPUTE
Example 12.2
1.189
From the data, we calculate:
For this term
and so:
Check Requisite Conditions
1.190
The Student t distribution is robust, which means that if the population is nonnormal, the results of the t-test and confidence interval estimate are still valid provided that the population is not extremely nonnormal.
To check this requirement, draw a histogram of the data and see how bell-shaped the resulting figure is. If a histogram is extremely skewed (say in the case of an exponential distribution), that could be considered extremely nonnormal and hence t-statistics would not be valid in this case.
Inference About Population Variance
1.191
If we are interested in drawing inferences about a population's variability, the parameter we need to investigate is the population variance: $\sigma^2$
The sample variance ($s^2$) is an unbiased, consistent and efficient point estimator for $\sigma^2$. Moreover, the statistic $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$ has a chi-squared distribution, with n-1 degrees of freedom.

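Testing & Estimating Population Variance
1.192
Combining this statistic:
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
With the probability statement:
$P\left(\chi^2_{1-\alpha/2} < \chi^2 < \chi^2_{\alpha/2}\right) = 1 - \alpha$
Yields the confidence interval estimator for $\sigma^2$:
lower confidence limit: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}}$    upper confidence limit: $\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$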

IDENTIFY
Example 12.3
1.193
Consider a container filling machine. Management wants a machine to fill 1 liter (1,000 cc) so that the variance of the fills is less than 1 cc$^2$. A random sample of n=25 1-liter fills was taken. Does the machine perform as it should at the 5% significance level?
We want to show that the variance is less than 1 cc$^2$:
H1: $\sigma^2$ < 1
(so our null hypothesis becomes: H0: $\sigma^2$ = 1). We will use this test statistic:
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
COMPUTE
Example 12.3
1.194
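Since our alternative hypothesis is phrased as:
H1: $\sigma^2$ < 1
We will reject H0 in favor of H1 if our test statistic falls into this rejection region:
$\chi^2 < \chi^2_{1-\alpha,n-1} = \chi^2_{.95,24} = 13.85$
We compute the sample variance to be: $s^2$ = .8088
And thus our test statistic takes on this value:
$\chi^2 = \frac{(n-1)s^2}{\sigma^2} = \frac{(25-1)(.8088)}{1} = 19.41$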
Example 12.4
1.195
As we saw, we cannot reject the null hypothesis in

favor of the alternative. That is, there is not enough
evidence to infer that the claim is true.
Note: the result does not say that the variance is
greater than 1, rather it merely states that we are
unable to show that the variance is less than 1.
We could estimate (at 99% confidence say) the

variance of the fills
COMPUTE
Example 12.4
1.196
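In order to create a confidence interval estimate of the variance, we need these formulae:
lower confidence limit: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}}$    upper confidence limit: $\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$
We know $(n-1)s^2$ = 19.41 from our previous calculation, and we have from Table 5 in Appendix B:
$\chi^2_{\alpha/2,n-1} = \chi^2_{.005,24} = 45.5585$ and $\chi^2_{1-\alpha/2,n-1} = \chi^2_{.995,24} = 9.88623$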
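The interval estimate for the variance is plain arithmetic once the two chi-squared table values are in hand. The sketch below assumes the standard table values for 24 degrees of freedom at 99% confidence (45.5585 and 9.88623); these are hard-coded because the Python standard library has no chi-squared quantile function. Variable names are illustrative.

```python
# Example 12.3/12.4: n = 25 fills with sample variance s^2 = 0.8088.
n = 25
s2 = 0.8088

# Chi-squared table values for 24 degrees of freedom, 99% confidence
# (assumed here from a standard chi-squared table):
chi2_upper_tail = 45.5585   # chi-squared_{.005, 24}
chi2_lower_tail = 9.88623   # chi-squared_{.995, 24}

numerator = (n - 1) * s2    # (n - 1) s^2, about 19.41

# Dividing by the larger table value gives the lower limit, and vice versa.
lcl = numerator / chi2_upper_tail
ucl = numerator / chi2_lower_tail

print("(n-1)s^2 =", round(numerator, 2))
print("LCL =", round(lcl, 3), "UCL =", round(ucl, 3))
```

Note that the value 1 lies inside the resulting interval, which fits the earlier conclusion that the data cannot show the variance to be below 1.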
Comparing Two Populations
1.197
Previously we looked at techniques to estimate and test parameters for one population:
Population Mean $\mu$, Population Variance $\sigma^2$
We will still consider these parameters when we are looking at two populations, however our interest will now be:
The difference between two means, $\mu_1 - \mu_2$.
The ratio of two variances, $\sigma_1^2/\sigma_2^2$.

Difference of Two Means
1.198
In order to test and estimate the difference between two population means, we draw random samples from each of two populations. Initially, we will consider independent samples, that is, samples that are completely unrelated to one another.
Because we are comparing two population means, we use the statistic:
$\bar{x}_1 - \bar{x}_2$
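Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
1.199
1. $\bar{x}_1 - \bar{x}_2$ is normally distributed if the original populations are normal, or approximately normal if the populations are nonnormal and the sample sizes are large ($n_1, n_2$ > 30).
2. The expected value of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$.
3. The variance of $\bar{x}_1 - \bar{x}_2$ is $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
and the standard error is: $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$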

Making Inferences About $\mu_1 - \mu_2$
1.200
Since $\bar{x}_1 - \bar{x}_2$ is normally distributed if the original populations are normal, or approximately normal if the populations are nonnormal and the sample sizes are large ($n_1, n_2$ > 30), then:
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
is a standard normal (or approximately normal) random variable. We could use this to build test statistics or confidence interval estimators for $\mu_1 - \mu_2$

Making Inferences About $\mu_1 - \mu_2$
1.201
except that, in practice, the z statistic is rarely used since the population variances are unknown.
Instead we use a t-statistic. We consider two cases for the unknown population variances: when we believe they are equal and, conversely, when they are not equal.
When are variances equal?
1.202
How do we know when the population variances are equal?
Since the population variances are unknown, we can't know for certain whether they're equal, but we can examine the sample variances and informally judge their relative values to determine whether we can assume that the population variances are equal or not.
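Test Statistic for $\mu_1 - \mu_2$ (equal variances)
1.203
1) Calculate the pooled variance estimator as
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
2) and use it here:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
degrees of freedom: $\nu = n_1 + n_2 - 2$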
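CI Estimator for $\mu_1 - \mu_2$ (equal variances)
1.204
The confidence interval estimator for $\mu_1 - \mu_2$ when the population variances are equal is given by:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
where $s_p^2$ is the pooled variance estimator and the degrees of freedom are $\nu = n_1 + n_2 - 2$.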

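Test Statistic for $\mu_1 - \mu_2$ (unequal variances)
1.205
The test statistic for $\mu_1 - \mu_2$ when the population variances are unequal is given by:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
degrees of freedom: $\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
Likewise, the confidence interval estimator is:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$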
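The two t-statistics and their degrees of freedom can be sketched as small helper functions. The function names and the summary statistics passed in below are hypothetical, chosen only for illustration; they are not the data from any example in these slides.

```python
from math import sqrt


def pooled_t(x1, s1_sq, n1, x2, s2_sq, n2, diff0=0.0):
    """Equal-variances t-statistic and its degrees of freedom."""
    # Pooled variance estimator: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t = ((x1 - x2) - diff0) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def unequal_variances_t(x1, s1_sq, n1, x2, s2_sq, n2, diff0=0.0):
    """Unequal-variances t-statistic and its approximate degrees of freedom."""
    a, b = s1_sq / n1, s2_sq / n2
    t = ((x1 - x2) - diff0) / sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics (means, variances, sample sizes).
t_eq, df_eq = pooled_t(10.0, 4.0, 25, 9.0, 4.0, 25)
t_uneq, df_uneq = unequal_variances_t(10.0, 4.0, 25, 9.0, 4.0, 25)

print("equal variances:   t =", round(t_eq, 3), "df =", df_eq)
print("unequal variances: t =", round(t_uneq, 3), "df =", round(df_uneq, 1))
```

When the two sample variances and sample sizes are identical, as in this illustration, the two statistics and their degrees of freedom coincide; they diverge as the sample variances or sizes differ.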

IDENTIFY
Example 13.2
1.206
Two methods are being tested for assembling office chairs. Assembly times are recorded (25 times for each method). At a 5% significance level, do the assembly times for the two methods differ?
That is, H1: $\mu_1 - \mu_2 \neq 0$
Hence, our null hypothesis becomes: H0: $\mu_1 - \mu_2 = 0$
Reminder: This is a two-tailed test.

COMPUTE
Example 13.2
1.207
The assembly times for each of the two methods are

recorded and preliminary data is prepared
The sample variances are similar, hence we will assume that the
population variances are equal
COMPUTE
Example 13.2
1.208
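Recall, we are doing a two-tailed test, hence the rejection region will be:
$t < -t_{\alpha/2,\nu}$ or $t > t_{\alpha/2,\nu}$
The number of degrees of freedom is:
$\nu = n_1 + n_2 - 2 = 25 + 25 - 2 = 48$
Hence our critical values of t (and our rejection region) become:
$t < -t_{.025,48}$ or $t > t_{.025,48}$, where $t_{.025,48}$ is about 2.01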
COMPUTE
Example 13.2
1.209
In order to calculate our t-statistic, we need to first

calculate the pooled variance estimator, followed by
the t-statistic
INTERPRET
Example 13.2
1.210
Since our calculated t-statistic does not fall into the

rejection region, we cannot reject H0 in favor of H1, that
is, there is not sufficient evidence to infer that the mean
assembly times differ.
INTERPRET
Example 13.2
1.211
Excel, of course, also provides us with the

information
Compare
or look at p-value
Confidence Interval
1.212
We can compute a 95% confidence interval estimate for the difference in mean assembly times as:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
That is, we estimate that the mean difference between the two assembly methods lies between -.36 and .96 minutes.
Note: zero is included in this confidence interval.
Matched Pairs Experiment
1.213
Previously when comparing two populations, we

examined independent samples.
If, however, an observation in one sample is

matched with an observation in a second sample,
this is called a matched pairs experiment.
To help understand this concept, let's consider Example 13.4.
Identifying Factors
1.214
Factors that identify the t-test and estimator of $\mu_D$:

Inference about the ratio of two variances
1.215
So far we've looked at comparing measures of central location, namely the mean of two populations.
When looking at two population variances, we consider the ratio of the variances, i.e. the parameter of interest to us is: $\sigma_1^2/\sigma_2^2$
The sampling statistic $F = \frac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2}$ is F distributed with $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$ degrees of freedom.
Inference about the ratio of two variances
1.216
Our null hypothesis is always:
H0: $\sigma_1^2/\sigma_2^2 = 1$
(i.e. the variances of the two populations will be equal, hence their ratio will be one)
Therefore, our statistic simplifies to:
$F = \frac{s_1^2}{s_2^2}$
df1 = n1 - 1
df2 = n2 - 1
IDENTIFY
Example 13.6
1.217
In Example 13.1, we looked at the variances of the samples of people who consumed high fiber cereal and those who did not and assumed they were not equal. We can use the ideas just developed to test if this is in fact the case.
We want to show: H1: $\sigma_1^2/\sigma_2^2 \neq 1$
(the variances are not equal to each other)
Hence we have our null hypothesis: H0: $\sigma_1^2/\sigma_2^2 = 1$

CALCULATE
Example 13.6
1.218
Since our research hypothesis is: H1: $\sigma_1^2/\sigma_2^2 \neq 1$
We are doing a two-tailed test, and our rejection region is:
$F > F_{\alpha/2,\nu_1,\nu_2}$ or $F < F_{1-\alpha/2,\nu_1,\nu_2}$
CALCULATE
Example 13.6
1.219
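Our test statistic is:
$F = \frac{s_1^2}{s_2^2}$
[Figure: F distribution with the two-tail critical values .58 and 1.61 marked]
The computed value falls in the rejection region. Hence there is sufficient evidence to reject the null hypothesis in favor of the alternative; that is, there is a difference in the variance between the two populations.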
INTERPRET
Example 13.6
1.220
We may need to work with the Excel output before drawing conclusions.
Our research hypothesis
H1: $\sigma_1^2/\sigma_2^2 \neq 1$
requires two-tail testing, but Excel only gives us values for one-tail testing.
If we double the one-tail p-value Excel gives us, we have the p-value of the test we're conducting (i.e. 2 x 0.0004 = 0.0008). Refer to the text and CD Appendices for more detail.
