Besterfield: Quality Control, 8th ed. 2009 Pearson Education, Upper Saddle River, NJ 07458.
All rights reserved
GEG202
INTRODUCTORY STATISTICS AND ENGINEERING COMPUTING
INSTRUCTORS
AJIBOLA, O. O. E. - PhD, Systems Engineering
BALOGUN, O. J. - Teaching Asst., Systems Engineering
Outline
Introduction
Frequency Analysis
Data collection procedures
Data Reduction techniques
Distributions, Expectation, Dispersion
a) Measure of Central tendency
b) Measure of location
c) Measure of dispersion

Outline-Continued
Other Measures
Concept of a Population and Sample
The Normal Curve
Tests for Normality
Learning Objectives
When you have completed this chapter you should
be able to:
Know the difference between a variable and an
attribute.
Perform mathematical calculations to the correct
number of significant figures.
Construct histograms for simple and complex data.

Learning Objectives-contd.
When you have completed this chapter you should
be able to:
Calculate and effectively use the different measures
of central tendency, dispersion, and their
interrelationship.
Understand the concept of a universe and a sample.
Understand the concept of a normal curve and the
relationship to the mean and standard deviation.

Learning Objectives-contd.
When you have completed this chapter you
should be able to:
Calculate the percent of items below a value,
above a value, or between two values for data
that are normally distributed.
Calculate the process center given the percent of
items below a value
Perform the different tests of normality
Construct a scatter diagram and perform the
necessary related calculations.

Definition of Statistics:
1. A collection of quantitative data pertaining to
a subject or group. Examples are blood
pressure statistics etc.
2. The science that deals with the collection,
tabulation, analysis, interpretation, and
presentation of quantitative data
Introduction
Two phases of statistics:
Descriptive Statistics:
Describes the characteristics of a product or
process using information collected on it.
Inferential Statistics (Inductive):
Draws conclusions on unknown process
parameters based on information contained
in a sample.
Uses probability
Introduction
Data is the collection of facts in a study.
Primary data are observations collected from source.
Attribute:
Discrete data. Counted data or attribute data.
Examples include:
How many of the products are defective?
How often are the machines repaired?
How many people are absent each day?
Collection of Data
Types of Data: Primary, Secondary, Tertiary.
Variable:
Continuous data. Data values can be any
real number. Measured data.
Examples include:
How long is each item?
How long did it take to complete the task?
What is the weight of the product?
Length, volume, time
Collection of Data
Frequency Distribution
Measures of Central Tendency
Measures of Dispersion
Analyses of Data
Ungrouped Data
Grouped Data
Frequency Distribution
There are three types of frequency distributions
Categorical frequency distributions
Ungrouped frequency distributions
Grouped frequency distributions
Categorical frequency distributions
Can be used for data that can be placed in
specific categories, such as nominal- or
ordinal-level data.
Examples - political affiliation, religious
affiliation, blood type etc.
Categorical
Example: Blood Type
Class   Frequency   Percent
A       5           20
B       7           28
O       9           36
AB      4           16
Categorical
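A categorical frequency distribution like the blood type example can be tallied in a few lines. The sketch below is only illustrative: the raw observation list is hypothetical, built so that its counts match the table, and the names `blood_types`, `counts`, and `percents` are not from the slides.

```python
from collections import Counter

# Hypothetical raw observations, constructed to match the counts in the table.
blood_types = ["A"] * 5 + ["B"] * 7 + ["O"] * 9 + ["AB"] * 4

counts = Counter(blood_types)      # class -> frequency
total = sum(counts.values())       # total number of observations

# Percent of observations in each class, rounded to a whole percent.
percents = {cls: round(100 * f / total) for cls, f in counts.items()}

for cls in ["A", "B", "O", "AB"]:
    print(f"{cls:<3} {counts[cls]:>3} {percents[cls]:>4}")
```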
Ungrouped frequency distributions
Can be used to analyze data when the total
number of sample points involved is not large.
Examples - number of kilometers your
instructors have to travel from home to
campus, number of girls in a 4-child family etc.
Ungrouped
Example: Number of Kilometers Traveled
Class   Frequency
5       24
10      16
15      10
Ungrouped
Grouped frequency distributions
Can be used when the range of values in the
data set is very large. The data are grouped
into a manageable number of classes.
Example - the life of Duracell batteries in
hours.
Grouped
Example: Lifetimes of Duracell Batteries
Class limits   Class boundaries   Frequency   Cumulative frequency
24 - 37        23.5 - 37.5        4           4
38 - 51        37.5 - 51.5        14          18
52 - 65        51.5 - 65.5        7           25
Grouped
Frequency Distributions
Table 4-3 Different Frequency Distributions of Data Given in Table 4-1
Number          Frequency   Relative    Cumulative   Relative cumulative
nonconforming               frequency   frequency    frequency
0               15          0.29        15           0.29
1               20          0.38        35           0.67
2               8           0.15        43           0.83
3               5           0.10        48           0.92
4               3           0.06        51           0.98
5               1           0.02        52           1.00
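The relative and cumulative columns of Table 4-3 follow directly from the frequency column. A minimal sketch, assuming only the six frequencies from the table; the variable names are illustrative:

```python
# Frequencies for 0..5 nonconforming units, taken from Table 4-3.
frequencies = [15, 20, 8, 5, 3, 1]

n = sum(frequencies)  # total number of observations

# Relative frequency: each frequency divided by the total.
relative = [f / n for f in frequencies]

# Cumulative frequency: running total of the frequencies.
cumulative = []
running = 0
for f in frequencies:
    running += f
    cumulative.append(running)

# Relative cumulative frequency: each running total divided by the total.
relative_cumulative = [c / n for c in cumulative]

for value, (f, rf, cf, rcf) in enumerate(
    zip(frequencies, relative, cumulative, relative_cumulative)
):
    print(f"{value:>2} {f:>3} {rf:.2f} {cf:>3} {rcf:.2f}")
```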
Frequency Histogram
[Figure: frequency histogram; x-axis: Number Nonconforming (0 to 5), y-axis: Frequency (0 to 25)]
Relative Frequency Histogram
[Figure: relative frequency histogram; x-axis: Number Nonconforming (0 to 5), y-axis: Relative Frequency (0.00 to 0.45)]
Cumulative Frequency Histogram
[Figure: cumulative frequency histogram; x-axis: Number Nonconforming (0 to 5), y-axis: Cumulative Frequency (0 to 60)]
The histogram is the most important graphical tool
for exploring the shape of data distributions.
The Histogram
Step 1: Find range of distribution, largest -
smallest values
Step 2: Choose number of classes, 5 to 12
Step 3: Determine width of classes
Step 4: Determine class boundaries
Step 5: Draw frequency histogram
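The five steps can be sketched in code. This is a rough illustration, not the textbook procedure: it assumes the class width is simply the range divided by the number of classes, starts the first boundary at the smallest value, and uses a hypothetical data set. The function name `build_histogram` is made up for this example.

```python
def build_histogram(data, num_classes):
    """Return (boundaries, counts) for equal-width classes."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    # Step 1: range = largest value - smallest value.
    low, high = min(data), max(data)
    data_range = high - low
    if data_range == 0:
        raise ValueError("all values are equal; no classes can be formed")
    # Step 2: the number of classes (5 to 12) is chosen by the caller.
    # Step 3 (assumption): width = range / number of classes.
    width = data_range / num_classes
    # Step 4: class boundaries, starting at the smallest value.
    boundaries = [low + k * width for k in range(num_classes + 1)]
    # Tally each value; the largest value goes into the last class.
    counts = [0] * num_classes
    for x in data:
        k = min(int((x - low) / width), num_classes - 1)
        counts[k] += 1
    return boundaries, counts


# Hypothetical sample data, for illustration only.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 7, 8, 9, 10, 11]
boundaries, counts = build_histogram(sample, 5)

# Step 5: draw the frequency histogram (here as text bars).
for k, count in enumerate(counts):
    print(f"{boundaries[k]:5.1f} to {boundaries[k + 1]:5.1f} | {'*' * count}")
```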
Bar Graph
Polygon of Data
Cumulative Frequency Distribution or Ogive
Other Types of Frequency Distribution Graphs
Bar Graph and Polygon of Data
Cumulative Frequency
Figure 4-6 Characteristics of frequency distributions
Characteristics of Frequency Distribution Graphs
Analysis of Histograms
Figure 4-7 Differences due to location, spread, and shape
Analysis of Histograms
Figure 4-8 Histogram of Wash Concentration
The three measures in common use are the:
Average
Median
Mode
Measures of Central Tendency
Average-Ungrouped Data
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Average-Grouped Data
$$\bar{X} = \frac{\sum_{i=1}^{h} f_i X_i}{n} = \frac{f_1 X_1 + f_2 X_2 + \cdots + f_h X_h}{f_1 + f_2 + \cdots + f_h}$$
where $h$ = number of cells, $f_i$ = frequency, $X_i$ = midpoint
Average-Weighted Average
$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
Used when a number of averages are
combined with different frequencies
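A weighted average can be checked with a short function. The sketch below uses hypothetical numbers (three sample averages combined using their sample sizes as weights); the function name `weighted_average` is illustrative.

```python
def weighted_average(values, weights):
    """Sum of w_i * X_i divided by the sum of w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical example: averages 10, 12, 15 from samples of size 2, 3, 5.
combined = weighted_average([10, 12, 15], [2, 3, 5])
print(combined)  # (20 + 36 + 75) / 10
```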
Median-Grouped Data
$$M_d = L_m + \left(\frac{\frac{n}{2} - cf_m}{f_m}\right) i$$
$L_m$ = lower boundary of the cell with the median
$n$ = total number of observations
$cf_m$ = cumulative frequency of all cells below $L_m$
$f_m$ = frequency of median cell
$i$ = cell interval
Example Problem
Table 4-7 Frequency Distribution of the Life of 320 tires in 1000 km
Boundaries   Midpoint   Frequency   Computation
23.6-26.5    25.0       4           100
26.6-29.5    28.0       36          1008
29.6-32.5    31.0       51          1581
32.6-35.5    34.0       63          2142
35.6-38.5    37.0       58          2146
38.6-41.5    40.0       52          2080
41.6-44.5    43.0       34          1462
44.6-47.5    46.0       16          736
47.6-50.5    49.0       6           294
Total                   320         11549
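The totals in Table 4-7 give the grouped average directly: the Computation column is frequency times midpoint, and the average is the sum of that column divided by the total frequency. A minimal sketch using only the table's midpoints and frequencies (variable names are illustrative):

```python
# Midpoints and frequencies from Table 4-7 (tire life in 1000 km).
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)                                         # total frequency
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))  # Computation column total
grouped_mean = sum_fx / n                                    # grouped average

print(n, sum_fx, round(grouped_mean, 1))
```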
Median-Grouped Data
Using data from Table 4-7
$$M_d = L_m + \left(\frac{\frac{n}{2} - cf_m}{f_m}\right) i$$
$$M_d = 35.6 + \left(\frac{\frac{320}{2} - 154}{58}\right) 3 = 35.9$$
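The grouped median calculation can be reproduced in code. The sketch below walks the cells of Table 4-7 until the cumulative frequency reaches n/2, then interpolates inside that cell; the function name `grouped_median` is an illustrative choice, not from the slides.

```python
def grouped_median(lower_boundaries, frequencies, interval):
    """Median for grouped data: L_m + ((n/2 - cf_m) / f_m) * i."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cumulative = 0  # cumulative frequency of all cells below the current one
    for lower, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:
            # This cell contains the median; interpolate within it.
            return lower + (half - cumulative) / f * interval
        cumulative += f
    raise ValueError("median cell not found")


# Lower cell boundaries and frequencies from Table 4-7; cell interval is 3.
lower_boundaries = [23.6, 26.6, 29.6, 32.6, 35.6, 38.6, 41.6, 44.6, 47.6]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

median_life = grouped_median(lower_boundaries, frequencies, 3)
print(round(median_life, 1))
```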
Mode
The Mode is the value that occurs with the
greatest frequency.

It is possible to have no mode in a series of
numbers or to have more than one mode.
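A small sketch of finding the mode, covering the no-mode and multiple-mode cases. It assumes a convention that is not spelled out in the slides: when several distinct values all occur equally often, the series is reported as having no mode. The function name `modes` and the data are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of modes; an empty list means no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumption: if every distinct value ties, report "no mode".
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3]))        # one mode
print(modes([1, 2, 2, 3, 3, 4]))  # two modes
print(modes([1, 2, 3]))           # no mode
```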
Figure 4-9 Relationship among average, median and mode
Relationship Among the
Measures of Central Tendency
Range
Standard Deviation
Variance
Measures of Dispersion
The range is the simplest and easiest to
calculate of the measures of dispersion.
$$R = X_h - X_l$$
where $X_h$ is the largest value and $X_l$ is the smallest value in the data set
Measures of Dispersion-Range
Measures of Dispersion-Standard Deviation
Sample Standard Deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
$$s = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2 / n}{n - 1}}$$
Standard Deviation
Ungrouped Technique
$$s = \sqrt{\frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}}$$
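The definition form, the shortcut form, and the ungrouped technique are algebraically the same quantity. A quick numerical check, using a hypothetical data set and Python's standard `statistics.stdev` as a reference:

```python
import math
import statistics

# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)

# Definition: squared deviations from the average, divided by n - 1.
s_definition = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Shortcut: sum of squares minus (sum squared) / n, divided by n - 1.
s_shortcut = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Ungrouped technique: n * sum of squares minus (sum squared), over n(n - 1).
s_ungrouped = math.sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

print(s_definition, s_shortcut, s_ungrouped, statistics.stdev(data))
```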
Standard Deviation
Grouped Technique
$$s = \sqrt{\frac{n \sum_{i=1}^{h} f_i X_i^2 - \left(\sum_{i=1}^{h} f_i X_i\right)^2}{n(n - 1)}}$$
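As an illustration, the grouped technique can be applied to the tire-life data of Table 4-7. This is a sketch under the assumption that each cell is represented by its midpoint; the variable names are illustrative.

```python
import math

# Midpoints and frequencies from Table 4-7 (tire life in 1000 km).
midpoints = [25.0, 28.0, 31.0, 34.0, 37.0, 40.0, 43.0, 46.0, 49.0]
frequencies = [4, 36, 51, 63, 58, 52, 34, 16, 6]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))       # sum of f_i * X_i
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))  # sum of f_i * X_i^2

# Grouped technique: n * sum(f X^2) minus (sum(f X))^2, over n(n - 1).
s_grouped = math.sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

print(round(s_grouped, 2))
```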
Relationship Between the
Measures of Dispersion
As n increases, the accuracy of R decreases
Use R when there is a small amount of data or the data
are too scattered
If n > 10, use the standard deviation
A smaller standard deviation means better quality

Relationship Between the
Measures of Dispersion
Figure 4-10 Comparison of two distributions with equal average and range
Other Measures
There are three other measures that are
frequently used to analyze a collection of data:
Skewness
Kurtosis
Coefficient of Variation

Skewness
Skewness is the lack of symmetry of the data.
For grouped data:
$$a_3 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^3 / n}{s^3}$$
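A minimal sketch of the grouped skewness calculation. It assumes s is the sample standard deviation (divisor n - 1) computed from the same grouped data; the function name `grouped_skewness` and both data sets are hypothetical.

```python
import math


def grouped_skewness(midpoints, frequencies):
    """a3 = (sum of f_i * (X_i - mean)^3 / n) / s^3 for grouped data."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    # Assumption: s uses the n - 1 divisor (sample standard deviation).
    s = math.sqrt(
        sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)
    )
    third_moment = sum(f * (x - mean) ** 3 for f, x in zip(frequencies, midpoints)) / n
    return third_moment / s ** 3


# Symmetric data: skewness is zero.
a3_symmetric = grouped_skewness([1, 2, 3], [1, 2, 1])
# Long tail to the right: skewness is positive.
a3_right = grouped_skewness([1, 2, 3, 4], [4, 3, 2, 1])

print(a3_symmetric, a3_right)
```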

Skewness
Figure 4-11 Left (negative) and right (positive) skewness distributions
Kurtosis
Kurtosis provides information regarding the shape
of the population distribution (the peakedness or
heaviness of the tails of a distribution).
For grouped data:
$$a_4 = \frac{\sum_{i=1}^{h} f_i (X_i - \bar{X})^4 / n}{s^4}$$
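The grouped kurtosis calculation has the same structure as skewness, with fourth powers. The sketch below again assumes s is the sample standard deviation (divisor n - 1); the function name `grouped_kurtosis` and the data are hypothetical.

```python
import math


def grouped_kurtosis(midpoints, frequencies):
    """a4 = (sum of f_i * (X_i - mean)^4 / n) / s^4 for grouped data."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    # Assumption: s uses the n - 1 divisor (sample standard deviation).
    s = math.sqrt(
        sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)) / (n - 1)
    )
    fourth_moment = sum(f * (x - mean) ** 4 for f, x in zip(frequencies, midpoints)) / n
    return fourth_moment / s ** 4


# Hypothetical data: midpoints 1, 2, 3 with frequencies 1, 2, 1.
a4 = grouped_kurtosis([1, 2, 3], [1, 2, 1])
print(a4)
```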

Kurtosis
Figure 4-11 Leptokurtic and Platykurtic distributions
Coefficient of Variation
The coefficient of variation (CV) is a measure of how
much variation exists in relation to the mean.
$$CV = \frac{s\,(100\%)}{\bar{X}}$$
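The coefficient of variation is the sample standard deviation expressed as a percentage of the average. A short sketch with a hypothetical data set, using the standard `statistics` module:

```python
import statistics

# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)   # average
s = statistics.stdev(data)     # sample standard deviation (n - 1 divisor)

cv = 100 * s / mean            # coefficient of variation, in percent
print(round(cv, 1))
```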
Population
Set of all items that possess a
characteristic of interest

Sample
Subset of a population
Population and Sample
Parameter is a characteristic of a population; in other words, it
describes a population
Example: average weight of the population, e.g.
50,000 cans made in a month.
Statistic is a characteristic of a sample, used to
make inferences on the population parameters that
are typically unknown; also called an estimator
Example: average weight of a sample of 500 cans
from that month's output, an estimate of the average
weight of the 50,000 cans.
Parameter and Statistic
Characteristics of the normal curve:
It is symmetrical -- Half the cases are to one
side of the center; the other half is on the
other side.
The distribution is single peaked, not bimodal
or multi-modal
Also known as the Gaussian distribution
The Normal Curve
Characteristics:
Most of the cases will fall in the center portion of
the curve and as values of the variable become
more extreme they become less frequent, with
"outliers" at the "tail" of the distribution few in
number. It is one of many frequency distributions.


The Normal Curve
The standard normal distribution is a normal
distribution with a mean of 0 and a standard deviation
of 1. Normal distributions can be transformed to
standard normal distributions by the formula:

$$Z = \frac{X_i - \mu}{\sigma}$$
Standard Normal Distribution
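Once a value is converted to Z, the percent of items below it can be read from a normal table or computed from the error function. A minimal sketch, assuming normally distributed data and a hypothetical process with mean 50 and standard deviation 5; the function names are illustrative.

```python
import math


def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma


def percent_below(x, mu, sigma):
    """Percent of a normal population that lies below x."""
    z = z_score(x, mu, sigma)
    # Standard normal cumulative distribution via the error function.
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical process: mean 50, standard deviation 5.
z = z_score(55, 50, 5)
below = percent_below(55, 50, 5)
print(z, round(below, 2))
```

The percent above a value is 100 minus the percent below it, and the percent between two values is the difference of their two percent-below figures.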

Relationship between the Mean
and Standard Deviation
Mean and Standard Deviation
Same mean but different standard deviation
Mean and Standard Deviation
Same mean but different standard deviation

IF THE DISTRIBUTION IS NORMAL
Then the mean is the best measure of
central tendency
Most scores bunched up in middle
Extreme scores are less frequent,
therefore less probable

Normal Distribution
Percent of items included between certain values of the std. deviation
Normal Distribution
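The percent of items within a given number of standard deviations of the mean can be computed rather than read from a chart. A small sketch for a normal distribution; the function name `percent_within` is illustrative, and printed table values may differ slightly in the last digit due to rounding.

```python
import math


def percent_within(k):
    """Percent of a normal population within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/-{k} sigma: {percent_within(k):.2f}%")
```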
Histogram
Skewness
Kurtosis
Tests for Normality
Histogram:
Shape
Symmetrical
The larger the sample size, the better the
judgment of normality. A minimum sample size of
50 is recommended
Tests for Normality
Skewness (a3) and Kurtosis (a4)
Skewed to the left or to the right (a3=0 for a
normal distribution)
The data are peaked as the normal
distribution (a4=3 for a normal distribution)
The larger the sample size, the better the
judgment of normality (sample size of 100 is
recommended)
Tests for Normality
Probability Plots
Order the data from the smallest to the largest
Rank the observations (starting from 1 for the
lowest observation)
Calculate the plotting position
$$PP = \frac{100(i - 0.5)}{n}$$
where $i$ = rank, $PP$ = plotting position, $n$ = sample size
Tests for Normality
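The ordering, ranking, and plotting-position steps can be sketched as follows. The observations are hypothetical and the function name `plotting_positions` is illustrative.

```python
def plotting_positions(data):
    """Return (value, PP) pairs with PP = 100 * (i - 0.5) / n."""
    ordered = sorted(data)  # order from smallest to largest
    n = len(ordered)
    # Rank i starts at 1 for the lowest observation.
    return [(x, 100 * (i - 0.5) / n) for i, x in enumerate(ordered, start=1)]


# Hypothetical observations, for illustration only.
observations = [34, 31, 40, 37, 28]
positions = plotting_positions(observations)

for value, pp in positions:
    print(value, pp)
```

Each pair would then be plotted on normal probability paper, with the data value on one axis and the plotting position on the probability scale.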
Procedure:
Order the data
Rank the observations
Calculate the plotting position

Probability Plots
Procedure contd:
Label the data scale
Plot the points
Attempt to fit by eye a best line
Determine normality

Probability Plots
Chi-Square Goodness of Fit Test
Chi-Square Test
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
Where
$\chi^2$ = Chi-squared
$O_i$ = Observed value in a cell
$E_i$ = Expected value for a cell
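The chi-squared statistic is a direct sum over the cells. A minimal sketch with hypothetical observed and expected counts; comparing the result against a critical value from a chi-squared table is not shown here.

```python
def chi_square(observed, expected):
    """Sum over cells of (O_i - E_i)^2 / E_i."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical cell counts, for illustration only.
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square(observed, expected)
print(statistic)  # (4 + 4 + 0) / 20
```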
The simplest way to determine if a
cause-and-effect relationship exists between two
variables
Scatter Diagram
Figure 4-19 Scatter Diagram
Supplies the data to confirm a hypothesis that
two variables are related
Provides both a visual and statistical means
to test the strength of a relationship
Provides a good follow-up to cause and effect
diagrams
Scatter Diagram
Straight Line Fit
$$m = \frac{\sum xy - \left[(\sum x)(\sum y) / n\right]}{\sum x^2 - \left[(\sum x)^2 / n\right]}$$
$$a = \sum y / n - m \left(\sum x / n\right)$$
$$y = a + mx$$
Where $m$ = slope of the line and $a$ is the intercept on the y axis
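The slope and intercept formulas translate directly into code. The sketch below uses hypothetical points that lie exactly on y = 1 + 2x, so the fitted values are easy to check; the function name `fit_line` is illustrative.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a + m*x; returns (m, a)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: [sum(xy) - sum(x)sum(y)/n] / [sum(x^2) - (sum(x))^2/n]
    m = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x.
    a = sum_y / n - m * (sum_x / n)
    return m, a


# Hypothetical points on the line y = 1 + 2x.
m, a = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(m, a)
```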
