Chapter 1 (Sections 1.1-1.5)
Statistics, Data, and Statistical Thinking

What is Statistics?
Statistics is a mathematical
science, but not a branch of
mathematics.
Statistics is the science of data.

Statistics is a science that deals with the study of
collecting,
classifying,
summarizing,
organizing,
analyzing,
interpreting, and
presenting data.
Statistics has two branches:
Descriptive Statistics
Inferential Statistics

Descriptive Statistics
organize, summarize, and present data.

Inferential Statistics
make reasonable guesses about population characteristics using sample data.
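
The difference between the two branches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of heights is simulated, and the function names describe and estimate_population_mean are hypothetical, not part of the course material. The first function summarizes the sample itself (descriptive); the second uses the sample to make a guess about the population mean and reports a standard error as a rough measure of how far off that guess may be (inferential).

```python
import math
import random
import statistics

def describe(sample):
    """Descriptive statistics: summarize the data we actually have."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation
    }

def estimate_population_mean(sample):
    """Inferential statistics: a reasonable guess about the population mean.

    Returns the sample mean and its standard error (stdev / sqrt(n)).
    """
    summary = describe(sample)
    standard_error = summary["stdev"] / math.sqrt(summary["n"])
    return summary["mean"], standard_error

# Hypothetical population (an assumption): heights in cm of N = 10,000 people.
rng = random.Random(1)
population = [rng.gauss(170, 10) for _ in range(10_000)]
sample = rng.sample(population, 50)  # sample of size n = 50

print(describe(sample))
print(estimate_population_mean(sample))
```

The sample mean will usually land close to, but not exactly on, the population mean; the standard error gives a sense of the typical gap.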

Terminology
Individual or Unit or Case or Experimental Unit
objects being described by a set of data
(example: person, household, car, animal, corn, etc.)
Population
the entire group of individuals about which we want
information (example: all students at IUPUI)
Note: Population size is denoted by N.
Sample
a representative part of the population
(example: a representative group of students at IUPUI)
Note: Sample size is denoted by n.
Variables
characteristics of the individuals that can be measured or observed
(example: height, yield, length, age, eye color, etc.)
Variables are either
Categorical/Qualitative
or
Numerical/Quantitative, which are in turn either Discrete or Continuous.

More Terminology
Categorical or Qualitative Variables
Variables that record a response as a member of a category.
(example: race, religion, hair color, etc.)
Numerical or Quantitative Variables
Variables that are measured in terms of numbers.
(example: age, height, weight, etc.)
Discrete Variables
Numerical variables whose possible values are isolated, countable points, typically whole numbers (integers).
(example: number of babies born in a hospital per day, number of wires in a cable, etc.)
Continuous Variables
Numerical variables that can take on any value within a range.
(example: distance between two towns, age, weight, etc.)
Methods of Collecting Data


Published Source
The data set of interest has already been collected and is
available to the public. Examples include:

Statistical Abstract of the United States: a comprehensive summary of statistics on the social, political, and economic organization of the United States (yearly)
Survey of Current Business: data on the economy of the United States (monthly)
The Wall Street Journal: financial data
The Sporting News: sports information

Census
Information is obtained from the whole population.

Designed Experimental Study

A study is an experiment if the investigator observes how a response variable behaves when the researcher manipulates one or more explanatory variables or factors.

The goal of an experiment is to determine the effect of the manipulated factors (the different levels) on the response variable.

In a well-designed experiment, the composition of the groups that will be exposed to different experimental conditions (treatments) is determined by random assignment.

Experimental studies actively produce data.
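
Random assignment, as described above, can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name randomly_assign, the treatment labels, and the 20 numbered experimental units are all hypothetical. The units are shuffled and then dealt out to the treatments in turn, so chance alone decides which unit receives which treatment.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn.

    Group sizes differ by at most one unit.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical example: 20 numbered subjects, two experimental conditions.
groups = randomly_assign(range(1, 21), ["treatment", "control"], seed=5)
for name, members in groups.items():
    print(name, sorted(members))
```

Passing a seed makes the assignment reproducible, which is convenient for documenting how the groups were formed.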

Observational Study

A study is an observational study if the investigator observes characteristics of a subset of the members of one or more existing populations.

The goal of an observational study is usually to draw conclusions about the corresponding population or about differences between two or more populations.

The researcher must obtain a sample that is representative of the population of interest. This is best accomplished through some well-designed random sampling procedure.

Observational studies passively collect data.

Sampling
Information is obtained from a small group (sample) of objects/individuals taken from the population.
The sample should be a representative sample, that is, it should reflect as closely as possible the relevant characteristics of the population under consideration.

Simple Random Sampling
A sampling procedure for which each possible sample of a given size is equally likely to be the one obtained. A sample obtained in this way is called a Simple Random Sample (SRS). This procedure can be implemented in two ways:
Mechanical methods
Thoroughly mix symmetrical items in a physical randomizing device. (example: drawing slips of paper from a hat)
Methods using Random Numbers
Create a list, called a sampling frame, of all the objects or individuals in the population. Assign each item a number, then use a table of random numbers or a random number generator to select the sample.
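
The random-number method can be sketched with Python's standard random module. The function name simple_random_sample and the sampling frame of 18 numbered students are assumptions made for illustration. random.Random.sample draws without replacement, so every subset of the requested size is equally likely to be chosen.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct items; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: 18 students numbered 1 to 18.
frame = list(range(1, 19))
print(simple_random_sample(frame, 6, seed=42))
```

Fixing the seed reproduces the same sample on every run; leaving it as None gives a fresh sample each time.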

Systematic Random Sampling

Divide a population list into as many consecutive segments as you need.

Randomly choose a starting point in the first segment.

Sample at that same point in each segment.

Figure: a population list divided into Segment 1, Segment 2, and Segment 3.
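
The three steps of systematic random sampling can be sketched as below. The function name systematic_sample is hypothetical, and the sketch assumes the population size is a multiple of the sample size; otherwise the leftover items at the end of the list are never selected.

```python
import random

def systematic_sample(population_list, n, seed=None):
    """Split the list into n consecutive segments of length k, choose a
    random position in the first segment, and take the item at that same
    position in every segment."""
    k = len(population_list) // n  # segment length
    if k == 0:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting point in the first segment
    return [population_list[start + i * k] for i in range(n)]

# Hypothetical population list: 18 students numbered 1 to 18.
frame = list(range(1, 19))
print(systematic_sample(frame, 6, seed=0))
```

With 18 students and a sample of 6, each segment holds 3 students, so the result is every 3rd student beginning at a random one of the first three.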

Stratified Random Sampling
The population is divided into subgroups (strata).
Strata are based on a specific characteristic, such as
Age
Education level
Etc.
Use simple random sampling within each stratum.
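
A minimal sketch of stratified random sampling follows. The function name stratified_sample, the stratum_of callback, and the two age groups are hypothetical. The sketch also assumes equal allocation (the same number drawn from every stratum); drawing in proportion to stratum size is another common choice.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group items by stratum, then take a simple random sample in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in population:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

# Hypothetical class of 18 students as (id, age group) pairs.
students = [(i, "younger" if i <= 9 else "older") for i in range(1, 19)]
print(stratified_sample(students, lambda s: s[1], 3, seed=1))
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the whole class cannot promise.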

Cluster Random Sampling
The population is divided into clusters, usually geographic.
Random sampling is used to choose clusters.
All members of the selected clusters are included in the sample.
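
Cluster random sampling can be sketched in the same style. The function name cluster_sample and the six groups of three students are assumptions for illustration. The random draw is applied to the cluster labels, not to individuals, and every member of a chosen cluster is kept.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster label to the list of its members.

    Randomly choose n_clusters labels, then keep every member of each
    chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for label in chosen for member in clusters[label]]

# Hypothetical class of 18 students split into 6 groups of 3.
groups = {g: [3 * g + 1, 3 * g + 2, 3 * g + 3] for g in range(6)}
print(cluster_sample(groups, 2, seed=3))
```

Contrast this with stratified sampling: there, every subgroup contributes a few members; here, a few subgroups contribute all of their members.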

Figure: Stratified Random Sampling compared with Cluster Random Sampling.

Convenience Sampling

Data are used from population members that are readily available.
In general, not a good sampling technique.
BE WARY OF BIAS!
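
Why a convenience sample invites bias can be shown with a small simulation. The quiz scores below are invented for illustration, and the assumption that scores fall from the front row to the back row is ours, not the slides'. The convenience sample takes the 6 students nearest the teacher; for comparison, the code averages the means of many simple random samples of the same size.

```python
import random
import statistics

# Hypothetical data: quiz scores of 18 students, listed from the front row
# to the back row. The downward trend is an assumption for illustration.
scores = [95, 92, 90, 88, 87, 85, 80, 78, 75, 74, 70, 68,
          65, 62, 60, 58, 55, 50]

population_mean = statistics.mean(scores)

# Convenience sample: the 6 students closest to the teacher.
convenience_mean = statistics.mean(scores[:6])

# For comparison: the average of many simple-random-sample means.
rng = random.Random(0)
srs_means = [statistics.mean(rng.sample(scores, 6)) for _ in range(2000)]
average_srs_mean = statistics.mean(srs_means)

print(f"population mean:         {population_mean:.1f}")
print(f"convenience sample mean: {convenience_mean:.1f}")
print(f"average SRS mean:        {average_srs_mean:.1f}")
```

With these made-up numbers the population mean is 74 while the convenience sample mean is 89.5. Individual random samples also miss the population mean, but they miss in both directions, so their average stays close to it.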

Examples
In a class of 18 students, 6 are chosen for an assignment.

Sampling Type    Example
Random           Pull 6 names out of a hat
Systematic       Select every 3rd student
Stratified       Divide the class into 2 equal age groups. Randomly choose 3 from each group
Cluster          Divide the class into 6 groups of 3 students each. Randomly choose 2 groups
Convenience      Take the 6 students closest to the teacher

What is Bias?
Bias in sampling is the tendency for samples to differ from the corresponding population in some systematic way.
Bias can occur because of the method used to select a sample (selection bias).
Bias can also occur because of the way the data are collected after the sample has been selected (nonresponse or measurement bias).

Selection Bias
Occurs when the way the sample is selected systematically excludes some part of the population of interest.
If the members of the population included in the sample differ from the excluded members on a variable that is important to the study, conclusions based on the sample data may not be valid for the population of interest.

Non-Response Bias
Occurs when responses are not obtained from all individuals selected for inclusion in the sample. This type of bias can distort results if those who respond differ in important ways from those who do not respond.
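
Non-response bias can be illustrated with a short simulation. Everything here is assumed for the sake of the example: the population of weekly overtime hours is simulated, and the rule that busier people are less likely to reply (the hypothetical function responds) is invented. The sample itself is a proper simple random sample; the distortion comes only from who answers.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: weekly overtime hours for 10,000 employees.
population = [rng.uniform(0, 20) for _ in range(10_000)]

# Step 1: a simple random sample is selected for the survey.
selected = rng.sample(population, 1_000)

# Step 2 (assumption): the busier someone is, the less likely they reply.
# Response probability falls linearly from 1.0 at 0 hours to 0.2 at 20 hours.
def responds(hours):
    return rng.random() < 1.0 - 0.04 * hours

respondents = [h for h in selected if responds(h)]

print(f"mean of everyone selected: {statistics.mean(selected):.2f}")
print(f"mean of respondents only:  {statistics.mean(respondents):.2f}")
```

Under this response rule the respondents report noticeably less overtime on average than the full selected sample, even though the selection step was random.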

Measurement Bias (Error)
Refers to inaccuracies in the values of the data recorded. Occurs when the method of observation tends to produce values that systematically differ from the true value in some way.
