
QA 605

WINTER QUARTER
2006-2007 ACADEMIC YEAR
Instructor: James J. Cochran
Office: 117A CAB
Telephone: (318) 257-3445
Hours: Tuesday & Thursday 8:00 a.m. – 10:00 a.m.
Wednesday 8:00 a.m. – noon &
1:00 p.m. - 3:00 p.m.
or by appointment
e-mail: jcochran@cab.latech.edu
URL: http://www.cab.latech.edu/jcochran/index.htm

Textbooks:
- Regression Analysis: Draper & Smith, Applied Regression
Analysis, Wiley-Interscience, 1998 (3rd edition)
- Experimental Design: Dean & Voss, Design and Analysis of
Experiments, Springer-Verlag, 1999 (1st edition)
- Multivariate Statistics: Johnson & Wichern, Applied
Multivariate Statistical Analysis, Prentice-Hall, 2001
(5th edition)
Prerequisites: MATH 125 (college algebra)
QA 390 (calculus & linear/matrix algebra)
QA 622 (Statistics for Graduate Business Studies I)
computer literacy (ability to learn and use SAS)
Grading: Literature Review 150 points
Midterm Exam 150 points
Comprehensive Final Exam 200 points
Total 500 points

I. Introduction to Design and Analysis of Experiments
A. Methods of Data Collection

Observation – Selection of a portion of the population and
measurement or observation of the values of the variables in
question for the selected elements

Experimentation – Manipulation of the values (or levels) of
one or more (independent) variables or treatments and
observation of the corresponding change in the values of one
or more (dependent) variables or responses
B. Why experiment?
1. To determine the cause(s) of variation in the response

2. To find conditions under which the optimal (maximum or
minimum) response is achieved

3. To compare responses at different levels of controllable
variables

4. To develop a model for predicting responses

C. Basic definitions
1. Treatments – different combinations of conditions
that we wish to test

2. Treatment Levels – the relative intensities at which a
treatment will be set during the experiment

3. Treatment Factor (or Factor) – one of the controlled
conditions of the experiment (these combine to form the
treatments)

4. Experimental Unit – subject on which a treatment will be
applied and from which a response will be elicited – also
called measurement or response units
5. Responses – outcomes that will be elicited from
experimental units after treatments have been applied

6. Design of Experiments (DOE) – also referred to as
Experimental Design, this is the study of planning efficient
and systematic collection of responses from experimental units

7. Experimental Design – rule for assigning treatment levels
to experimental units

8. Analysis of Variance (ANOVA) – principal statistical means
for evaluating potential sources of variation in the responses

9. Replication – observing individual responses of multiple
experimental units under identical experimental conditions

10. Repeated Measurements – observing multiple responses of a
single experimental unit under identical experimental
conditions

11. Blocking – partition the experimental units into groups
(or blocks) that are homogeneous in some sense

12. Covariate – additional responses collected from the
experimental units, usually to be used as predictors (and so
are sometimes called predictive responses) – these are not
part of a designed experiment (why?)
13. Randomization – nonsystematic assignment of
experimental units to treatments

14. Blinding – hiding from the analyst which experimental
units have been assigned to which treatments

15. Confounding – design situation in which the effect of one
factor or treatment cannot be distinguished from that of
another factor or treatment – this is the experimental
equivalent of perfect multicollinearity (why?)

D. What characterizes an experiment?


1. The treatments to be used

2. The experimental units to be used

3. The way that treatment levels are assigned to experimental
units (or vice versa)

4. The responses that are measured


Heisenberg Uncertainty Principle – in a quantum mechanical
world, a particle's momentum p and position x can never both
be measured precisely.
• There is an uncertainty associated with each measurement,
e.g., there is some dp and dx, which you can never
eliminate (even in a perfect experiment)!
• This occurs because whenever you make a measurement,
you must disturb the system. (In order to know something
is there, you must bump into it in some sense.) The sizes of
the uncertainties are not independent; they are roughly
related by
(uncertainty in momentum) × (uncertainty in position) > h (= Planck's constant)

Is there a Heisenberg Uncertainty Phenomenon in behavioral
research?

E. What characterizes a good experimental design?
1. It avoids systematic error – systematic error leads to
bias when estimating differences in responses
between (i.e., comparing) treatments

2. It allows for precise estimation – achieves a relatively
small random error, which in turn depends on
• the random error in the responses
• the number of experimental units
• the experimental design employed
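
As a point of reference (a standard result, not stated
explicitly in these notes): for a completely randomized design
with r replications per treatment and response variance
sigma^2, the variance of an estimated difference between two
treatment means is

Var(\bar{y}_i - \bar{y}_j) = \frac{2\sigma^2}{r}

so precision improves by reducing the random error in the
responses, increasing the number of experimental units, or
choosing a design (e.g., blocking) that removes nuisance
variation.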

3. It allows for proper estimation of error

4. It has broad validity – the experimental units are a sample
of the population in question
F. Dean & Voss’ Checklist for Planning an Experiment
1. Carefully define the objectives of the experiment – make a
list of all essential, precise issues to be addressed by the
experiment

2. Identify all potential sources of variation in the
responses, including
• treatment factors and their levels – identify each as
o a major source (or treatment factor) or minor source (or
nuisance or noise factor)
o a controllable or uncontrollable source
• experimental units – these should be representative of the
population (materials and conditions) to which the
conclusions of the experiment will be applied
• blocking factors, noise factors, and covariates

3. Choose a scheme for assigning experimental units to
treatment levels (i.e., select an experimental design)

4. Specify the experimental process – identify the
• response(s) to be measured (including any covariates)
• the experimental procedure – how will the experiment be
administered?
• any anticipated difficulties in
o achieving/maintaining treatment levels
o obtaining responses
5. Conduct a pilot study
• collect only a few observations
• evaluate the variation (this information will be used to
estimate the required sample size)
• reevaluate your decisions on Steps 1 – 4

6. Specify the hypothesized model
• linear vs. nonlinear
• fixed, random, or mixed effects
• response(s) to be measured (including any covariates)

7. Outline the analyses to be conducted
• to achieve objectives from Step 1
• using the design selected in Step 3
• using the model specified in Step 6

8. Estimate the required sample size using results from the
pilot study
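
As a rough illustration of Step 8 (a sketch, not a procedure
from the text), the usual normal-approximation formula gives
the number of replications per treatment needed to detect a
specified difference between two treatment means; the
pilot-study numbers below are hypothetical.

from scipy.stats import norm

def reps_per_treatment(sigma, delta, alpha=0.05, power=0.80):
    # Approximate replications per treatment needed to detect a
    # difference delta between two treatment means (two-sided test),
    # using the standard normal-approximation sample-size formula.
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the test
    z_beta = norm.ppf(power)            # quantile for the desired power
    return 2 * ((z_alpha + z_beta) * sigma / delta) ** 2

# Hypothetical pilot estimates: response standard deviation 4.0,
# smallest difference worth detecting 3.0 rating points.
print(reps_per_treatment(sigma=4.0, delta=3.0))   # about 27.9, round up to 28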

9. Review your decisions in Steps 1 – 8 and make necessary
revisions
G. An Example Experiment
1. Suppose you are employed as a statistical consultant for a
packaged goods manufacturer that is experiencing wide
fluctuations in consumer satisfaction with the crunchiness of
its corn-flake cereal. The objective of an appropriate
experiment may be to determine the source(s) of this
volatility.

2. Potential sources of variation in the response (consumer
rating of the corn flake's crunchiness) include
• proportion of liquid ingredients in recipe (30% vs. 40%)
• proportion of fat in recipe (15% vs. 20%)
• baking temperature (350° vs. 400° Fahrenheit)
• baking time (10 minutes vs. 12 minutes)

3. A randomized 2⁴ complete block design (to be discussed
later) is chosen.
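
A small Python sketch (not part of the course materials) of
what this design implies: the four two-level factors form
2⁴ = 16 treatment combinations, and the run order is
randomized separately within each block (corn flour supplier).
The factor labels below are hypothetical.

import itertools, random

factors = {
    "water":       ["30%", "40%"],
    "fat":         ["15%", "20%"],
    "temperature": ["350F", "400F"],
    "time":        ["10 min", "12 min"],
}
treatments = list(itertools.product(*factors.values()))   # the 16 combinations

for block in ["Supplier A", "Supplier B"]:
    run_order = treatments[:]          # each treatment appears once per block
    random.shuffle(run_order)          # separate randomization within each block
    print(block, run_order[:3], "...") # first few runs in this block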

4. In the experimental process we will
• Bake batches of corn flakes under various conditions and
measure consumer ratings on these batches of corn flakes.
The hypothesized major sources of variation in responses
include:
o proportion of liquid ingredients in recipe
o proportion of fat in recipe
o baking temperature
o baking time
These are all controllable. In addition, a minor
source of variation might be the source of corn
flour used in the batches of corn flakes.
Note that we must be able to control the baking
temperature in order to execute this experiment.
5. For our small pilot study our results might look like
this:

[Table: pilot-study data layout – one column for each of the
16 treatment combinations (proportion of water 30% or 40% ×
proportion of fat 15% or 20% × baking temperature 350° or 400°
× baking time 10 or 12 minutes), with two observations (xxx)
recorded per combination]

Note that, in the original color-coded table, corn flour from
Supplier A was used in the batches that produced the blue
observations, while corn flour from Supplier B was used in the
batches that produced the green observations.

In the data from the pilot study we may encounter something
like this:
[Plot: consumer crunchiness rating vs. baking temperature
(350°, 400°), with observations from both corn flour suppliers]

Note that:
- batches baked at the higher temperature consistently
yield a crunchier corn flake
- there appears to be little difference in crunchiness
between batches baked with corn flour from Supplier
A and batches baked with corn flour from Supplier B
or this:

[Plot: consumer crunchiness rating vs. baking temperature
(350°, 400°), showing a supplier effect]

Note that:
- there appears to be little difference in crunchiness
between batches baked the different temperatures
- batches baked with corn flour from Supplier A
consistently yield a crunchier corn flake than batches
baked with corn flour from Supplier B

or this:

[Plot: consumer crunchiness rating vs. baking temperature
(350°, 400°), showing both a supplier effect and a temperature
effect]

Note that:
- batches baked at the higher temperature consistently
yield a crunchier corn flake
- batches baked with corn flour from Supplier A
consistently yield a crunchier corn flake than batches
baked with corn flour from Supplier B
or this:

[Plot: consumer crunchiness rating vs. baking temperature
(350°, 400°), with the supplier effects differing across
temperatures]

Note that batches baked with corn flour from Supplier A yield
a crunchier corn flake at the higher baking temperature while
batches baked with corn flour from Supplier B yield a less
crunchy corn flake at the higher baking temperature – this
suggests an interaction between baking temperature and corn
flour supplier.

6. We hypothesized a linear model of the form

Crunchiness
Rating = Constant +
Proportion of Water Effect +
Proportion of Fat Effect +
Baking Temperature Effect +
Baking Time Effect +
Block (Corn Flour Provider) Effect +
Temperature x Block Interaction +
Error
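
One way a model of this form might be fit once the data are in
hand (a sketch, not part of the course notes): an ordinary
least squares fit with a formula mirroring the effects listed
above. The file name and column names are hypothetical.

import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: one row per batch, with the crunchiness rating,
# the four factor levels, and the block (corn flour supplier).
df = pd.read_csv("cornflake_batches.csv")   # hypothetical file

model = smf.ols(
    "rating ~ C(water) + C(fat) + C(temperature) + C(time)"
    " + C(supplier) + C(temperature):C(supplier)",
    data=df,
).fit()

print(model.summary())
print(sm.stats.anova_lm(model, typ=2))      # ANOVA table for the fitted model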
7. We will use our original design.

8. We wish to make four replications per treatment (16
treatment combinations × 2 corn flour suppliers × 4
replications = 128 total batches):
[Table: full-experiment data layout – one column for each of
the 16 treatment combinations, with eight observations (xxx)
per combination (4 replications × 2 corn flour suppliers)]

9. We have found that we do not have enough corn flour from
Supplier A to run four replications, so we will reduce our
experiment to two replications per treatment (64 total
batches).

H. Some Standard Experimental Designs


1. Completely Randomized Designs (CRD) – assignment
of experimental units to the treatments is completely
random. Note that:
- these designs are used when no blocking factors are
involved
- the statistical properties of these designs are
determined entirely by the number of observations in
each treatment (r_1, r_2, …, r_ν, where r_i represents the
number of observations on the i-th treatment)
- the model is of the form
Response = constant + treatment effect + error
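
A minimal sketch of the completely random assignment a CRD
requires (the unit labels and replication counts below are
hypothetical, not from the notes):

import random

# Hypothetical CRD: 12 experimental units assigned completely at random
# to 3 treatments with r1 = r2 = r3 = 4 observations per treatment.
units = [f"unit_{i}" for i in range(1, 13)]
treatments = ["T1"] * 4 + ["T2"] * 4 + ["T3"] * 4

random.shuffle(units)                        # the completely random assignment
for unit, treatment in zip(units, treatments):
    print(unit, "->", treatment)
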
2. Block Designs - experimental units are partitioned into
blocks and then assigned randomly to the treatments
within each block. Note that:
- these designs are used when experimental units can
be partitioned into groups (or blocks) that are
homogeneous in some meaningful sense
- in a Complete Block Design (CBD) each treatment is
observed the same number of times in each block
- in a Randomized Complete Block Design or
Randomized Block Design (RBD) each treatment is
observed once in each block
- in an Incomplete Block Design (IBD) the block size is
smaller than the number of treatments
- the model is of the form
Response = constant + block effect + treatment effect + error
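
A minimal sketch of the assignment rule for a randomized
(complete) block design, with hypothetical treatments and
blocks: each treatment appears once per block, and the
randomization is carried out independently within each block.

import random

treatments = ["T1", "T2", "T3"]
blocks = ["block_1", "block_2", "block_3", "block_4"]

for block in blocks:
    order = treatments[:]            # every treatment observed once in the block
    random.shuffle(order)            # independent randomization within the block
    print(block, order)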

3. Designs with Multiple Blocking Factors – designs that are
said to be either crossed or nested:
- in Crossed Block Designs (or Row-Column Designs when two
blocking factors are involved) each combination of levels for
the blocks is used, i.e., for a two-block design in which one
block has three levels and the other block has two levels:
                      Block A
            Level 1    Level 2    Level 3
Block B
  Level 1    cell       cell       cell
  Level 2    cell       cell       cell
Note that:
o these designs allow for estimation of main effects
and interactions
o the model is of the form
Response = constant + row block effect + column
block effect + treatment effect + error
- in Nested Block Designs (or Hierarchical Designs) various
levels of one blocking factor appear in combination with only
one level of another blocking factor, i.e., for a two-block
design in which one block has three levels and the other block
has six levels we may have:

                      Block A
            Level 1    Level 2    Level 3
Block B
  Level 1    cell
  Level 2    cell
  Level 3               cell
  Level 4               cell
  Level 5                          cell
  Level 6                          cell

Note that:

o these designs allow only for estimation of main effects (no
interactions)
o the model is of the form
Response = constant + row block effect + column
block effect + treatment effect + error
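
A small sketch (not from the text) contrasting the two
layouts; the block labels are hypothetical, and the nested
pattern assumes two levels of Block B within each level of
Block A, as in the diagram above.

import itertools

# Crossed blocks: every combination of Block A and Block B levels occurs.
block_a = ["A1", "A2", "A3"]
block_b = ["B1", "B2"]
crossed_cells = list(itertools.product(block_a, block_b))   # 3 x 2 = 6 cells
print("crossed:", crossed_cells)

# Nested blocks: each level of Block B occurs with only one level of Block A.
nested_cells = {"A1": ["B1", "B2"], "A2": ["B3", "B4"], "A3": ["B5", "B6"]}
print("nested:", nested_cells)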

4. Split-Plot Designs – The split-plot design involves two
experimental factors, A and B. Levels of A are
randomly assigned to whole plots (main plots), and
levels of B are randomly assigned to split plots
(subplots) within each whole plot. The subplots are
assumed to be nested within the whole plots so that a
whole plot consists of a cluster of subplots and a level
of A is applied to the entire cluster. The design
provides more precise information about B than about
A, and it often arises when A can be applied only to
large experimental units.
Example: Suppose our response is crop growth, factor
A (the whole plot factor) represents irrigation levels for
large plots of land, and factor B (the subplot) represents
different fertilizers applied within each large plot of
land. The levels of B are randomly assigned to split
plots (subplots) within each whole plot.
                Factor A (Irrigation Level)
                  Level 1        Level 2
               Fertilizer B   Fertilizer C
Factor B       Fertilizer D   Fertilizer A
(fertilizer)   Fertilizer A   Fertilizer B
               Fertilizer C   Fertilizer D

Notice that the farmland has been divided into two whole plots
to which levels of factor A (irrigation) have been randomly
assigned. Each of the whole plots has then been subdivided
into split plots to which levels of factor B (fertilizer) have
been randomly assigned. Note that:
- there have been two separate, independent
randomizations in this experiment
- we will obtain more precise information about the
fertilizer effect (factor B) than the irrigation effect
(factor A) from this experiment

Note also that split-plot designs are useful when
- some of the factors of interest may be difficult to vary
while the remaining factors are easy to vary. As a result, the
order in which the treatment combinations for the experiment
are run is 'ordered' by these 'difficult to vary' factors, or
- experimental units are processed together as a batch
for one or more of the factors in a particular treatment
combination, or
- experimental units are processed individually one
right after the other for the same treatment
combination without resetting the factor settings for
that treatment combination
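
A minimal sketch of the two separate randomizations in the
irrigation/fertilizer example (labels are hypothetical; the
plot counts follow the example above):

import random

irrigation_levels = ["Irrigation 1", "Irrigation 2"]
fertilizers = ["Fertilizer A", "Fertilizer B", "Fertilizer C", "Fertilizer D"]
whole_plots = ["whole_plot_1", "whole_plot_2"]

random.shuffle(irrigation_levels)      # first randomization: factor A to whole plots
for plot, level in zip(whole_plots, irrigation_levels):
    subplot_order = fertilizers[:]
    random.shuffle(subplot_order)      # second randomization: factor B within the whole plot
    print(plot, level, subplot_order)
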
SOME questions you should be able to answer:
1. How do experiments differ from observational studies? How
does this impact conclusions that can be drawn from
research? What characterizes an experiment?
2. Why is experimentation an important research tool? What
are potential uses of experimentation in your research
discipline?
3. What is meant by the terms treatment, treatment level, factor,
experimental unit, response, design of experiments,
experimental design, replication, repeated measurements,
blocking, covariate, randomization, blinding, and
confounding?
4. Describe Dean & Voss’ checklist for planning an experiment.
Explain the rationale behind this approach.

5. Explain each of the following standard experimental
designs:
a. The Completely Randomized Design
b. The Block Design
c. The Crossed Block Design
d. The Nested Block Design
e. The Split-Plot Design
Under what conditions is the use of each of these designs
most appropriate?
