Lesson 4
Section 1.3
Study Beginnings
1
Populations and Samples
The population is the complete collection of
individuals or objects that you wish to learn about.
2
Parameters and Statistics
A parameter is a value (usually a proportion or
average) that describes the population.
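The distinction between a value computed on the whole population and the same value computed on a sample can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the population of 10,000, the 30% trait rate, the sample size of 200, and the helper name `proportion` are all hypothetical.

```python
import random


def proportion(values):
    """Fraction of True entries in a list of booleans."""
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 individuals, 30% of whom have some trait.
population = [i < 3000 for i in range(10_000)]
rng.shuffle(population)

# The parameter is computed from every member of the population.
parameter = proportion(population)

# A sample is a subset of the population; here, 200 cases chosen at random.
sample = rng.sample(population, 200)

# The same calculation on the sample gives a value that describes the sample.
statistic = proportion(sample)

print(f"population proportion (parameter): {parameter:.3f}")
print(f"sample proportion:                 {statistic:.3f}")
```

The population proportion is fixed, while the sample proportion would change if a different sample were drawn.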
3
Example 1
Define the population, sample, parameter, and
statistic from the following study:
4
Example 2
Define the population, sample, parameter, and
statistic from the following study:
5
Research Questions
The first step in conducting research is to identify
topics or questions that are to be investigated.
6
Research Questions
A research question should refer to a target
population. Often, however, it is too
expensive or difficult to collect data for every case
in a population. Instead, a sample is taken.
8
Example 4
Consider the following research question: "Does a
new drug reduce the number of deaths in patients
with severe heart disease?"
9
Anecdotal Evidence
Both of the conclusions of the last two examples
were based on some data. However, there were
two problems.
First, the data only represent one or two cases.
Second, it is unclear whether these cases are
actually representative of the population.
Data collected in this haphazard fashion are called
anecdotal evidence.
10
Anecdotal Evidence
When anecdotal
evidence is cited, there
is no reason to expect
the individuals to be
representative of
anyone but themselves.
They can make nice
stories, but lousy
statistics.
11
Bias
If someone were permitted to pick and choose
exactly which cases were included in a sample, it
is entirely possible that the sample could be
skewed to that person's interests. This introduces
bias into a sample.
12
An Example of Bad Data
The 1936 presidential election
between Franklin Roosevelt
and Alf Landon is notable for
the Literary Digest poll, which
was based on over two million
returned postcards.
In its October 31 issue, Landon
was predicted to easily win with
370 electoral votes and 57% of
the popular vote.
13
1936 Election Results
Landon's electoral vote total of eight is a tie for the
record low for a major-party nominee since the
current U.S. two-party system began in the 1850s.
The Literary Digest was completely discredited
because of the poll and was soon discontinued.
14
Why did the Literary Digest fail?
15
Bias
In general, there are three common types of bias that
might occur in a sample:
Selection bias: The method for selection makes
the sample unrepresentative of the population.
Nonresponse bias: A sample is chosen, but a
subset cannot or will not respond.
Response bias: Participants in a survey provide
incorrect information, intentionally or unintentionally.
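Selection bias, the first type above, can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions: a hypothetical population of 1000 cases, each with an invented numeric score, where the 20 cases that are easiest to reach happen to be the ones with the lowest scores.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1000 cases with a made-up score per case.
scores = [rng.gauss(50, 10) for _ in range(1000)]

# Biased selection: assume the easiest cases to reach are the 20 lowest scores.
easiest_to_reach = sorted(scores)[:20]

# Simple random sample: every case has the same chance of being selected.
random_sample = rng.sample(scores, 20)

print(f"population mean:        {mean(scores):.1f}")
print(f"easiest-to-reach mean:  {mean(easiest_to_reach):.1f}")
print(f"random-sample mean:     {mean(random_sample):.1f}")
```

Under these assumptions the convenience sample understates the population mean no matter how the scores fall, while the random sample only misses by chance.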
17
Bias
Bias is the bane of sampling: the one thing above
all to avoid.
18
Example 5
Indicate whether the potential bias is a selection
bias, a nonresponse bias, or a response bias.
19
Example 6
For each situation, explain why selection bias could be
introduced, and how it could affect your results.
a) A cage has 1000 rats; you pick the first 20 you can
catch for your experiment.
b) A public opinion poll is conducted using the
telephone directory.
c) You are conducting a study of a new diabetes drug;
you advertise for participants in the newspaper and
TV.
Example 7
You need to conduct a study of longevity for
people who were born in the decade following the
end of World War II in 1945. If you were to visit
graveyards and use only the birth/death dates
listed on tombstones, would you get good results?
Why or why not?
Example 8
"If you had to do it over again, would you have
children?" This is the question that advice columnist
Ann Landers asked her readers back in 1976. It turns
out that nearly 70% of the 10,000 responses she
received were "No." A professional poll by Newsday
found that 91% of randomly chosen respondents
would have children again.
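One way to see how the two figures could both arise is a back-of-the-envelope calculation in which readers decide for themselves whether to write in. The sketch below treats the Newsday figure as if it were the true population value (an assumption) and uses invented write-in rates; the function name and both rates are hypothetical, chosen only to show the mechanism.

```python
def no_share_among_respondents(p_no, rate_no, rate_yes):
    """Share of 'No' answers among those who respond.

    p_no     -- fraction of the population who would answer 'No'
    rate_no  -- fraction of 'No' people who choose to write in
    rate_yes -- fraction of 'Yes' people who choose to write in
    """
    responders_no = p_no * rate_no
    responders_yes = (1 - p_no) * rate_yes
    return responders_no / (responders_no + responders_yes)


# Assumption: 91% of the population would have children again, so 9% would not.
p_no = 0.09

# Hypothetical write-in rates: 'No' readers are 24 times more likely to respond.
rate_no = 0.024
rate_yes = 0.001

share = no_share_among_respondents(p_no, rate_no, rate_yes)
print(f"'No' share among write-in responses: {share:.1%}")
```

With these made-up rates, a population that is only 9% "No" produces a mailbag that is roughly 70% "No", because the people with the strongest feelings are the ones most likely to respond.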
22
Types of Variables
In many studies more than one variable is
recorded per case or individual.
It is often the purpose of a study to determine if
and/or how one variable (called the explanatory
variable) affects another (called the response
variable).
23
Types of Variables
Response Variable: The outcome of a study. A
variable you would be interested in predicting or
forecasting.
Explanatory Variable: Any variable that explains
the response variable.
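A short sketch can make the two roles concrete. It fits a least-squares line, which is one common way to predict a response from an explanatory variable and is not something these slides prescribe. The data values and the function name `fit_line` are invented for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


# Made-up data: x is the explanatory variable, y is the response variable.
explanatory = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(explanatory, response)

# Use the explanatory value of a new case to predict its response.
new_x = 6
predicted = slope * new_x + intercept
print(f"predicted response at x = {new_x}: {predicted:.2f}")
```

The roles are not interchangeable: the variable being predicted is the response, and the variable used to make the prediction is the explanatory variable.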
24
Example 9
Pick out which variable you think should be the
explanatory variable and which variable should be the
response.
a) Weights of nuggets of gold (in ounces) and their
market value (in $) over the last few days are
provided, and you wish to use this to estimate the
value of a gold ring that weighs 4 ounces.
25
Example 9 continued
b) You have data collected on the amount of time
since chlorine was added to the public swimming
pool and the concentration of chlorine still in the
pool. Chlorine was added at 8 AM, and you wish to
know what the concentration is now, at 3 PM.
c) You have data on the circumference of oak trees
(measured 12 inches from the ground) and their
age (in years). An oak tree in the park has a
circumference of 36 inches, and you wish to know
approximately how old it is.
26
Example 10
Suppose you wanted to conduct a study to predict
a student's success. Using a student's GPA as the
response variable, what are some explanatory
variables that might be worth considering?
Determine the variable type (categorical,
numerical) of each explanatory variable. For each
numerical explanatory variable, guess whether the
association with the response will be positive,
negative, or none.
27