Lecture 3: Measured data and Statistics
Introduces the concept of measurement
variation and statistical methods for
measuring and describing variation
Compiled by Ramdziah Md.Nasir
Recap Lecture 2: TQM
The Road to Business Growth
Business Growth
Continuous Improvement
Supplier Partnering
Employee Involvement
Focus on Quality
Leadership
Process Orientation
Customer Satisfaction
Clear Vision
TQM Implementation
Begins with the commitment of Senior Management and
the CEO
Involvement is required
Requires the education of Senior Management in TQM
concepts
Timing of the implementation process can be very
important
Formation of the Quality Council
Development of Core Values, Vision Statement,
Mission Statement, Quality Policy Statement
TQM Implementation
Quality Council:
Composed of: CEO, the Senior
Managers of the functional areas, such
as design, marketing, finance,
production, and quality; and a
coordinator or consultant
The coordinator will ensure that the team
members are empowered and know their
responsibilities
TQM Implementation
Quality Council Duties:
1.
Develop the core values, vision,
mission, and quality policy statements
2.
Develop the strategic long-term plan
with goals and the annual quality
improvement program with objectives
3.
Create the total education and training
plan
4.
Determine and continually monitor the
cost of poor quality
TQM Implementation
Quality Council Duties:
5.
Determine the performance measures for
the organization
6.
Determine projects that improve the
processes
7.
Establish multifunctional project and
departmental or work group team
8.
Establish or revise the recognition and
reward system
TQM Implementation
Core Values for the Malcolm Baldrige
National Quality Award:
Visionary Leadership
Customer-driven Excellence
Organizational & Personal Learning
Valuing Employees & Partners
TQM Implementation
Core Values for the Malcolm Baldrige
National Quality Award (cont'd):
Agility
Focus on the Future
Management for Innovation
Management by Fact
TQM Implementation
Core Values for the Malcolm Baldrige
National Quality Award (cont'd):
Social Responsibility
Focus on Results and Creating Value
Systems Perspective
TQM Implementation
Quality Statements:
Include the Vision Statement, Mission
Statement, and Quality Policy
Statement
They are part of the strategy planning
process, which includes goals and
objectives
Develop with input from all personnel
TQM Implementation
Seven Steps to Strategy Planning:
Customer Needs
Customer Positioning
Predict the Future
Gap Analysis
Closing the Gap
Alignment
Implementation
Continuous Process
Improvement/Development (CPD)
Figure 2-3 Input/output process model
INPUT: Materials, Money, Information, Data, etc.
PROCESS: People, Equipment, Method, Procedures, Environment, Materials
OUTPUT: Information, Data, Product, Service, etc.
Other labels in the figure: FEEDBACK, CONDITIONS, OUTCOMES
Taguchi's Loss Function
$L = D^2 C$
where
L = loss to society
D = distance from target value
C = cost of deviation
[Figure: loss (to producing organization, customer, and society) plotted against the quality characteristic, from low loss at the target to high loss toward the lower and upper specification limits, with regions labelled Best, Good, Poor and Unacceptable; below it, frequency distributions for target-oriented and conformance-oriented quality]
Target-oriented quality yields more product in the "best" category
Target-oriented quality brings product toward the target value
Conformance-oriented quality keeps products within 3 standard deviations
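The loss function L = D^2 C can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function name `taguchi_loss` and the numbers used (target 10.0, cost coefficient 50.0) are hypothetical values chosen only for illustration.

```python
def taguchi_loss(measured, target, cost_of_deviation):
    """Loss L = D^2 * C, where D is the distance from the target value."""
    distance = measured - target
    return distance ** 2 * cost_of_deviation


# Hypothetical example: target 10.0, cost coefficient C = 50.0
for measured in (10.0, 10.1, 10.2, 10.4):
    print(measured, round(taguchi_loss(measured, 10.0, 50.0), 3))
```

Because the deviation is squared, doubling the distance from the target quadruples the loss, which is why target-oriented quality is preferred over merely staying inside the specification limits.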
Learning Objectives
Understand the errors related to measurement
Know the round-off rules
Be able to distinguish between the two types of variation:
special cause and common cause
Know what statistics is and its applications
Know what distributions are and how they are used in
SPC
Be able to calculate the mean, median, mode, range and
standard deviation for a set of numbers
Be able to draw a histogram for a set of numbers
Measurement
Any measurement is only as good as the
measuring device/technique and the person
using it.
Measurement error always exists; a measured
value is an estimate.
Accuracy is the smallest unit on the measuring
device
The maximum error of a measurement is half the
accuracy
A distribution is an ordered set of numbers that are
grouped in some manner. The distribution
may be in table, graph or picture form
Example:
2.34 cm is accurate to the nearest
hundredth of a cm (0.01 cm); therefore the
maximum error is 0.01/2 = 0.005 cm
When a measurement is written, its
accuracy is implied by the number of
decimal places, e.g. 2.34 cm and 2.340 cm
are significantly different! The 2.34 cm
would lie from 2.335 up to (but not including) 2.345, and
2.340 would lie from 2.3395 up to (but not including)
2.3405!
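The rule "maximum error is half the smallest unit" can be applied mechanically to a written reading. This is a small sketch under the assumption that the reading is given as a decimal string, so that trailing zeros are preserved; the helper name `implied_interval` is hypothetical.

```python
from decimal import Decimal


def implied_interval(reading):
    """Return (lower, upper) bounds implied by the written decimal places."""
    value = Decimal(reading)
    # Smallest unit of the reading, e.g. 0.01 for "2.34" and 0.001 for "2.340"
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    max_error = unit / 2
    return value - max_error, value + max_error


print(implied_interval("2.34"))   # bounds 2.335 and 2.345
print(implied_interval("2.340"))  # bounds 2.3395 and 2.3405
```

The upper bound is the point at which the reading would round up to the next value, so the true measurement lies below it.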
Round-off Rules:
1. If the number to the right is half or more of that
place value, round UP to the next digit, e.g.
23.472 is 23.5 (nearest tenth)
2. If the number to the right is less than half of that
place value, truncate to that place value,
e.g. 23.414 to nearest tenth is 23.4
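The two round-off rules correspond to "round half up". Python's built-in `round` uses a different convention for exact halves (round half to even), so this sketch uses the standard `decimal` module instead; the helper name `round_half_up` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, place="0.1"):
    """Round a decimal string to the given place, rounding halves upward."""
    return Decimal(value).quantize(Decimal(place), rounding=ROUND_HALF_UP)


print(round_half_up("23.472"))  # 23.5 (rule 1: 0.072 is more than half of 0.1)
print(round_half_up("23.414"))  # 23.4 (rule 2: 0.014 is less than half of 0.1)
```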
Variation
2 types: (a) special-cause, (b) common-cause
Individual measurements are different,
but when grouped together, they form a
predictable pattern called a distribution
Every distribution has measurable
characteristics such as:
Location: position of the middle value (or
average value)
Spread: width of the distribution curve
Shape: the way the measurements stack
up
Special cause variation
(a) Special cause variation (or assignable-cause variation):
unpredictable variation that does not
normally occur, due to worn parts,
improper alignment, etc.
A derived variation
Can be eliminated by local action on a
particular segment of the process
Local action can handle ~15% of
process problems
Common cause variation
(b) Common cause variation
It is inherent (built-in) in the process,
not a derived variation
Approximately 85% of process
problems are due to common cause
variation
Removing the built-in variation requires
process changes, which is a decision by
management
Variation (within or between subgroups)
Under ideal conditions, e.g. for a
manufacturing process, only common
cause variation occurs within a subgroup
(batch), and special cause variation
occurs between subgroups.
Special causes should occur between
batches, not within a batch.
Care must be taken so that differences in
operators, machines, batches of raw
materials and production lines do not show
up within subgroups.
Use different control charts for different
operators if operators make a difference
Effect of Variation
(a) When special cause variation is present: unpredictable
distribution
[Figure: distributions for Day 1, Day 2 and Day 3 relative to the target]
(b) When special cause variation is eliminated, leaving only common
cause variation: predictable (process is in statistical control)
[Figure: distributions for Day 1, Day 2 and Day 3 relative to the target]
Histogram
Graphic representation of the frequencies of observed values,
usually plotted using rectangles.
Vertical axis is the frequency, horizontal axis is the category.
[Figure: histogram annotated with the frequency axis, the category axis, the cell interval i, the cell midpoint, and the lower and upper cell boundaries]
Steps to construct a histogram
Collect data and construct a tally sheet
Determine the range, $R = X_h - X_l$
Determine the cell interval, $i = R / (1 + 3.322 \log_{10} n)$
Determine the cell midpoint, using the lowest data value, $MP_l = X_l + 0.5i$
Determine the cell boundary for either upper or lower boundary
Plot the histogram
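The range, cell interval and lowest midpoint from the steps above can be computed as follows. This is a sketch only: the logarithm is assumed to be base 10, the function names are hypothetical, and the readings are made-up values for illustration.

```python
import math


def cell_interval(data_range, n):
    """Cell interval i = R / (1 + 3.322 log10 n)."""
    return data_range / (1 + 3.322 * math.log10(n))


def lowest_midpoint(x_low, interval):
    """Midpoint of the lowest cell, MPl = Xl + 0.5 i."""
    return x_low + 0.5 * interval


# Hypothetical readings, used only to illustrate the arithmetic
readings = [12.1, 12.4, 11.8, 12.9, 12.2, 12.6, 11.5, 12.3, 12.7, 12.0]
r = max(readings) - min(readings)   # range R = Xh - Xl
i = cell_interval(r, len(readings))
mp = lowest_midpoint(min(readings), i)
print(f"R = {r:.2f}, i = {i:.3f}, MPl = {mp:.3f}")
```

In practice the computed interval is usually rounded to a convenient value before the cell boundaries are laid out.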
Tally sheet
Sample   Tabulation   Frequency
1        III          3
2        IIII         4
3        II           2
4        I            1
What is Statistics?
Statistics is the science of data handling
Data types: (a) Variable data: quality characteristics that are
measurable and normally continuous (may take on any value);
(b) Attribute data: quality characteristics that are either present or not
present, conforming or nonconforming, countable, normally discrete
values (integers)
Its applications normally involve using sample information to make
decisions about a population of measurements
A population is the set of all possible data values of interest, while a
sample is only a subset (a part) of the population
[Figure: a population containing Sample 1, Sample 2 and Sample 3]
Four steps in the application of statistics: (a) collection of data; (b)
organization of data; (c) analysis of data; (d) interpretation of data
Descriptive vs Inductive Statistics
Descriptive (or deductive) statistics
attempts to describe and analyze a
subject or group
Inductive statistics tries to
draw, from a limited amount of data
(sample), an important conclusion about a
much larger amount of data (population).
Since the conclusions or inferences
cannot be made with absolute certainty,
the language of probability is often used
Descriptive Statistics
Measures of central tendency describe the center position of
the data (mean, median, mode)
Measures of dispersion describe the spread of the data (range,
variance, standard deviation)
Mean: $\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$, where $X_i$ is one observation and N is the number of observations
Median: the middle point of a data series (the observation in the middle of the sorted
data)
Mode: the most frequently occurring value
Example: 100 91 85 84 75 72 72 69 65
Mean = 79.22
Mode = 72
Median = 75
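The three measures of central tendency for the nine values on this slide can be checked with Python's standard `statistics` module. This is a minimal sketch, not part of the original slides.

```python
import statistics

scores = [100, 91, 85, 84, 75, 72, 72, 69, 65]

mean = statistics.mean(scores)      # sum of the values divided by their count
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```

With nine observations the median is the fifth value of the sorted list; for an even count, `statistics.median` averages the two middle values.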
Descriptive Statistics
Measures of dispersion (range, variance and standard deviation)
The range is calculated by taking the maximum value and
subtracting the minimum value.
Variance is the sum of the squared differences between
each value and the mean, divided by the number of observations
$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$
where $\mu$ = population mean
Example: 1, 3, 5, 7, 9, 11
Range = 11 - 1 = 10
The standard deviation is the square root of the variance. It measures
the spreading tendency of the data
$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$
If $\sigma$ is small, there is a high probability of
getting values close to the mean
value
If $\sigma$ is large, there is a high probability of getting
values away from the mean value
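For the data set 1, 3, 5, 7, 9, 11 on this slide, the range, variance and standard deviation can be worked through as below. The sketch divides by the number of observations, as the slide describes (the population form); the variable names are illustrative.

```python
import math
import statistics

data = [1, 3, 5, 7, 9, 11]

data_range = max(data) - min(data)              # 11 - 1 = 10
mu = sum(data) / len(data)                      # mean = 6.0
squared_deviations = [(x - mu) ** 2 for x in data]
variance = sum(squared_deviations) / len(data)  # 70 / 6
std_dev = math.sqrt(variance)

print(data_range, round(variance, 3), round(std_dev, 3))

# Cross-check against the standard library's population formulas
assert math.isclose(variance, statistics.pvariance(data))
assert math.isclose(std_dev, statistics.pstdev(data))
```

`statistics.variance` and `statistics.stdev` divide by n - 1 instead and would give slightly larger values for the same data.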
Descriptive Statistics
Other measures of dispersion (skewness, kurtosis,
coefficient of variation)
Skewness: lack of symmetry of the data distribution. A negative value
indicates a distribution skewed to the left; a positive value indicates one skewed to the right.
$a_3 < 0$: skewed left; $a_3 = 0$: symmetrical; $a_3 > 0$: skewed right
$a_3 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^3 / n}{s^3}$
Note: $s = \sigma$ = std dev
See examples 4.6, p147 & 4.7, p149
Descriptive Statistics
Measure of Dispersion: Kurtosis
Kurtosis measures the peakedness of the data. It is a dimensionless
value. The value must be compared to that of a normal distribution to determine
whether the distribution is more peaked or flatter.
$a_4 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^4 / n}{s^4}$
Note: $s = \sigma$ = std dev
[Figure: leptokurtic (more peaked), mesokurtic (normal) and platykurtic (flatter) distributions]
See example 4.8, p151
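Skewness (a3) and kurtosis (a4) are the third and fourth central moments divided by the cube and fourth power of the standard deviation. The sketch below assumes the standard deviation is computed by dividing by n (an assumption; some texts use n - 1), and the function name `skewness_kurtosis` is hypothetical. A normal distribution has a4 = 3, which serves as the comparison value.

```python
def skewness_kurtosis(values, freqs):
    """Return (a3, a4) for values x_i that occur with frequencies f_i."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n

    def central_moment(k):
        return sum(f * (x - mean) ** k for x, f in zip(values, freqs)) / n

    s = central_moment(2) ** 0.5  # standard deviation, dividing by n
    return central_moment(3) / s ** 3, central_moment(4) / s ** 4


# Symmetrical data: each value occurs once
a3, a4 = skewness_kurtosis([1, 3, 5, 7, 9, 11], [1, 1, 1, 1, 1, 1])
print(f"a3 = {a3:.3f}, a4 = {a4:.3f}")
```

For this evenly spread data a3 is 0 (symmetrical) and a4 is below 3, which marks it as flatter than a normal distribution (platykurtic).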
Descriptive Statistics
Measure of Dispersion: Coefficient of variation
CV measures how much variation exists relative to the mean. Unit in %.
$CV = \frac{s}{\bar{X}} \times 100\%$
See example 4.9, p152
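The coefficient of variation is the standard deviation expressed as a percentage of the mean. A short sketch follows; it assumes the population standard deviation (`statistics.pstdev`), and swapping in `statistics.stdev` would give the sample version. The function name and the data are illustrative only.

```python
import statistics


def coefficient_of_variation(data):
    """CV = (standard deviation / mean) x 100, in percent."""
    return statistics.pstdev(data) / statistics.mean(data) * 100


data = [1, 3, 5, 7, 9, 11]
print(f"CV = {coefficient_of_variation(data):.2f}%")
```

Because it is a ratio, the CV allows the variation of data sets with different units or very different means to be compared.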
Normal Distribution (Gaussian distribution)
Always a symmetrical, unimodal, bell-shaped distribution
with the mean, median and mode having the same value
Much variation in nature and in industry follows the
normal distribution curve.
Offers a good description of the variation occurring in most
quality characteristics in industry, so it becomes the basis
of many techniques
[Figure: normal curve with an x scale and the corresponding z scale, marked from -3 to +3]
Population, sample, reading (notations used)
[Figure: a population containing Sample 1, Sample 2 and Sample 3]
$\bar{X}$ (X-bar) = average value for a sample (which has a few
readings)
s = standard deviation of a sample
$\mu$ (Greek letter mu) = mean (equivalent to the average)
value for a population (which has a few samples and
readings)
$\sigma$ (Greek letter sigma) = standard deviation for a
population
n = number of readings from a sample
N = number of samples/groups in a population
Use of Statistics in Quality
Changing data into
information
Statistical Process Control (SPCs)
Historical Background
Walter Shewhart suggested that every process
exhibits some degree of variation, and that variation is therefore
to be expected.
He identified two types of variation: chance cause and
assignable cause
He proposed the first control chart to separate these two types of
variation.
SPC was applied during World War II to ensure
interchangeability of parts for weapons/ equipment.
Resurgence of SPC in the 1980s in response to
Japanese manufacturing success.
Product Control And Process Control Philosophy
The product control view:
measures quality of a product in terms of its acceptability as measured
by conformance to engineering specifications.
emphasizes detection and containment of defective material through
inspection/screening, therefore making
quality and productivity opposing rather than supportive forces.
The process control view:
emphasizes the prevention of defective material from being made in the
first place by seeking the root cause of the problem and eliminating it
altogether.
makes quality and productivity enhancement possible simultaneously
by continually seeking ways to reduce
variation, thereby eliminating waste and inefficiency in the process and
variation in performance of the product.
Mean, standard deviation
Example 1
The scores on a test given to students in a
large class are normally distributed with a
mean of 57 and a standard deviation of 10.
The passing score for the exam is 30. If a
student is randomly selected, what is the
probability that he or she passed the exam?
A score of 75 or greater is needed to obtain
an A on the exam. What percentage of the
students received an A?
Mean, standard deviation
Given: $\mu = 57$ and $\sigma = 10$, and the process represented is a normal
distribution.
To calculate the probability of passing, we need to find $P(X \ge 30)$. Use the z transformation to determine
the value of Z associated with the value of X:
$z = (30 - 57) / 10 = -2.7$
Therefore, we need to find $P(Z \ge -2.7)$. Using Table A.1 in the Appendix,
$P(Z \ge -2.7) = 1 - P(Z \le -2.7)$
$= 1 - 0.0035 = 0.9965$
Therefore, the probability of a student passing is 99.65%
To calculate the probability that a randomly selected student scores an A,
we need to find $P(X \ge 75)$. Use the z transformation to determine the value of Z associated with the value
of X:
$z = (75 - 57) / 10 = 1.8$
Therefore, we need to find $P(Z \ge 1.8)$. Using Table A.1 in the Appendix,
$P(Z \ge 1.8) = 1 - P(Z \le 1.8)$
$= 1 - 0.9641$
$= 0.0359$
Therefore, the probability of a student scoring an A is 3.59%
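The table look-ups in this example can be cross-checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module in place of Table A.1; it is a verification aid, not the method used on the slide.

```python
from statistics import NormalDist

scores = NormalDist(mu=57, sigma=10)

# z transformation for the two cut-off scores
z_pass = (30 - scores.mean) / scores.stdev
z_a = (75 - scores.mean) / scores.stdev

# P(X >= x) = 1 - P(X <= x)
p_pass = 1 - scores.cdf(30)
p_a = 1 - scores.cdf(75)

print(f"z = {z_pass:.1f}, P(pass) = {p_pass:.4f}")
print(f"z = {z_a:.1f}, P(A) = {p_a:.4f}")
```

The computed probabilities agree with the table-based results of 99.65% and 3.59% to four decimal places.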
The accompanying table represents the weight in grams of moulded instrument
display panels. The samples were collected at half-hour intervals. Prepare a
tally sheet of the individual measurements and then prepare a frequency
histogram of the data, clearly labelling the cell boundaries. Comment on the
shape of the distribution.
Sample   X1   X2   X3   X4   X5
1        14   15   13   14   13
2        20   18   14   17    8
3        14   17   14   11   14
4        15   16   11   18   14
5         9   17   18   13   12
6        19   15   14   15   16
7        16   13   14   13   17
8        14   17    9   16   15
9        14   14   12   13   13
10       15   13   17   14   16
11       18   18   16   15   11
12       20   12   13   17   14
13       11    8    9   12    7
14       12   14   16   14   20
15       18   17   12   19   18
16       19   17   16   16   17
17       14   13   15   16   18
18       14   17   12   16   11
19       18   15   16   15   12
20       15    9   12   13   20
Answer
The first step in creating a frequency histogram is to select the
number of cells and the cell boundaries. Since
the data are integer values, the cell boundaries
can be set halfway between integers (x - 0.5, x + 0.5, etc.). The
range of the data is 20 - 7 = 13, so using 14 cells of width 1
(one per integer value from 7 to 20) might work well. Thus, the integer values are
the cell midpoints and the cell boundaries are
set at 6.5, 7.5, 8.5, ... , 20.5. For example, the
boundaries for the cell with midpoint 13 are
12.5 and 13.5.
The next step is to make a tally of the frequency of occurrence of the
data within each cell, as follows:
Cell boundaries   Cell midpoint   Tally                   Frequency, fi
6.5 - 7.5         7               I                       1
7.5 - 8.5         8               II                      2
8.5 - 9.5         9               IIII                    4
9.5 - 10.5        10                                      0
10.5 - 11.5       11              IIIII                   5
11.5 - 12.5       12              IIIII IIII              9
12.5 - 13.5       13              IIIII IIIII I           11
13.5 - 14.5       14              IIIII IIIII IIIII III   18
14.5 - 15.5       15              IIIII IIIII I           11
15.5 - 16.5       16              IIIII IIIII II          12
16.5 - 17.5       17              IIIII IIIII I           11
17.5 - 18.5       18              IIIII IIII              9
18.5 - 19.5       19              III                     3
19.5 - 20.5       20              IIII                    4
Total, $\sum f_i$ = 100
the cell interval, $i = R / (1 + 3.322 \log_{10} n)$
cell midpoint $MP_l = X_l + i/2$ (normally odd
value)
cell boundaries = 0.1i =0.4 (lower
boundary), 3.5+i (upper boundary)
frequencies = from tally sheet
Next, this information is used to plot the
histogram:
[Figure: histogram of frequency (axis ticks 4 to 16) against weight in grams (axis ticks 8 to 20)]
From the shape of the histogram, it appears
that the data came from a process that
might be considered as a
candidate for representation by a normal
distribution
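The tally for the display-panel weights can be reproduced with a few lines of code. This is a sketch under one stated assumption: the X1 reading for sample 13 is taken as 11, which is the value consistent with the stated minimum of 7 and with a total of 100 readings in the tally. The helper name `tally_rows` is hypothetical.

```python
from collections import Counter

# Weights in grams, one list per column X1..X5 (samples 1 to 20).
# Assumption: X1 for sample 13 is taken as 11 (see the note above).
x1 = [14, 20, 14, 15, 9, 19, 16, 14, 14, 15, 18, 20, 11, 12, 18, 19, 14, 14, 18, 15]
x2 = [15, 18, 17, 16, 17, 15, 13, 17, 14, 13, 18, 12, 8, 14, 17, 17, 13, 17, 15, 9]
x3 = [13, 14, 14, 11, 18, 14, 14, 9, 12, 17, 16, 13, 9, 16, 12, 16, 15, 12, 16, 12]
x4 = [14, 17, 11, 18, 13, 15, 13, 16, 13, 14, 15, 17, 12, 14, 19, 16, 16, 16, 15, 13]
x5 = [13, 8, 14, 14, 12, 16, 17, 15, 13, 16, 11, 14, 7, 20, 18, 17, 18, 11, 12, 20]

weights = x1 + x2 + x3 + x4 + x5
counts = Counter(weights)


def tally_rows(counts, low, high):
    """One row per integer midpoint: (lower boundary, upper boundary, midpoint, frequency)."""
    return [(m - 0.5, m + 0.5, m, counts.get(m, 0)) for m in range(low, high + 1)]


rows = tally_rows(counts, min(weights), max(weights))
for lower, upper, midpoint, freq in rows:
    # A row of I characters acts as a text histogram
    print(f"{lower:4.1f} - {upper:4.1f}  {midpoint:2d}  {'I' * freq:<18}  {freq}")
print("Total:", sum(freq for *_, freq in rows))
```

The printed bars peak at 14 g and fall away on both sides, which is the roughly bell-shaped pattern described in the answer.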
(excel form)
What is Statistics?
What is it Not
Has Something to Do With Data.
Objectives of Data Collection
Understanding, insights, illumination
An Inexact Science Given Industrial
Realities
Probabilities in Manufacturing
Examples with objectives
classifying parts as being defective or non-defective: reducing the number of defectives
studying the number of monthly orders received:
better adjusting inventory levels to match orders
measuring gas output when acid concentrations
are changed: better predicting and controlling gas
levels
Statistical Thinking & Modelling
Engineers Think
Deterministically
Deterministic Models Do
Not Explain Variability
Deterministic Models Do Not
Account for Variability
Engineering
Education/Practice Blames
Factors That Remain a Mystery
Limitations in Measurement
Process
Engineering Method
Depends on Data
Real Data Exhibits
Variability
Obscures Ability to Make
Sound Decisions
Engineers Must Learn
to Think Statistically
Understanding of Risk
and Uncertainty
Key is Discovering
Sources of Variability
Data Collection
What is the fundamental purpose?
What important questions need answers?
What is the characteristic of interest?
How will it be measured? Issues
What is known about the measurement process?
How does engineering model impact data
collection?
What data does the model require?
How robust is the model to data error?
How do model parameters support problem
solution?
Are there physical constraints that impede ability to
collect data?
Statistic types
Deductive statistics describe a complete data
set
Inductive statistics deal with a limited amount
of data
Types of data
Variables data: quality characteristics that are
measurable values.
Measurable and normally continuous; may take on
any value.
Attribute data: quality characteristics that are
observed to be either present or absent,
conforming or nonconforming.
Countable and normally discrete; integer
Within vs. between subgroup
variation
Under ideal conditions: only common
cause variation occurs within subgroups
and special cause variation occurs
between subgroups
Special Causes Should Occur
Between Batches not Within
Care must be taken so that differences in
operators, machines, lots of raw materials,
production lines do not show up within
subgroups.
Do different control charts for different
operators if operators make a difference.
Descriptive statistics
Measures of Central Tendency
Describes the center position of the data
Mean Median Mode
Measures of Dispersion
Describes the spread of the data
Range Variance Standard deviation
Measures of central
tendency: Mean
Arithmetic mean $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$
where $x_i$ is one observation, $\sum$ means add up
what follows and N is the number of
observations
So, for example, if the data are: 0, 2, 5, 9, 12 the
mean is (0+2+5+9+12)/5 = 28/5 = 5.6
Measures of central
tendency: Median and mode
Median = the observation in the middle of
sorted data
Mode = the most frequently occurring value
Example: 100 91 85 84 75 72 72 69 65
Mode = 72
Median = 75
Mean = 79.22
Measures of dispersion:
range
The range is calculated by taking the maximum
value and subtracting the minimum value.
2, 4, 6, 8, 10, 12, 14
Range = 14 - 2 = 12
Measures of dispersion:
variance
Calculate the deviation from the mean for
every observation.
Square each deviation
Add them up and divide by the number of
observations
$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$
Measures of dispersion:
standard deviation
The standard deviation is the square root of
the variance. The variance is in square units
so the standard deviation is in the same units
as x.
$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$
Standard deviation and
curve shape
If $\sigma$ is small, there is a high probability of
getting a value close to the mean.
If $\sigma$ is large, there is a correspondingly higher
probability of getting values further away from
the mean.
Chebyshev's theorem
If a probability distribution has the mean $\mu$ and
the standard deviation $\sigma$, the probability of
obtaining a value which deviates from the
mean by at least k standard deviations is at
most $1/k^2$.
$P(|x - \mu| \ge k\sigma) \le \frac{1}{k^2}$
As a result
Probability of obtaining a value beyond k
standard deviations is at most:
2 standard deviations
$1/2^2 = 1/4 = 0.25$
3 standard deviations
$1/3^2 = 1/9 = 0.11$
4 standard deviations
$1/4^2 = 1/16 = 0.0625$
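The Chebyshev bounds listed here follow directly from 1/k^2. A minimal sketch (the function name `chebyshev_bound` is hypothetical):

```python
def chebyshev_bound(k):
    """Upper bound on P(|x - mu| >= k * sigma) for any distribution."""
    if k <= 0:
        raise ValueError("k must be positive")
    return 1 / k ** 2


for k in (2, 3, 4):
    print(f"{k} standard deviations: at most {chebyshev_bound(k):.4f}")
```

The bound holds for any distribution with a finite mean and standard deviation, so it is deliberately conservative; for a specific distribution such as the normal, the actual tail probability is much smaller.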
Other measures of
dispersion: skewness
When a distribution lacks symmetry, it is
considered skewed.
$a_3 < 0$: skewed left; $a_3 = 0$: symmetrical; $a_3 > 0$: skewed right
$a_3 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^3 / n}{s^3}$
Other measures of
dispersion: kurtosis
$a_4$ suggests the peakedness of the data
$a_4$ can be used to compare distributions
$a_4 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^4 / n}{s^4}$
The normal frequency
distribution
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / (2\sigma^2)}$
The normal curve
A normal curve is symmetrical about $\mu$
The mean, mode, and median are equal
The curve is unimodal and bellshaped
Data values concentrate around the mean
Area under the normal curve equals 1
The normal curve
If x follows a bellshaped (normal) distribution,
then the probability that x is within
1 standard deviation of the mean is 68%
2 standard deviations of the mean is 95 %
3 standard deviations of the mean is 99.7%
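The 68%, 95% and 99.7% figures can be reproduced from the normal cumulative distribution. This sketch uses `NormalDist` from Python's standard `statistics` module; the function name `within_k_sigma` is hypothetical.

```python
from statistics import NormalDist


def within_k_sigma(k, mu=0.0, sigma=1.0):
    """P(mu - k*sigma <= x <= mu + k*sigma) for a normal distribution."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sigma(k):.4f}")
```

The result does not depend on the particular mean and standard deviation, which is why the same percentages apply to every normal curve.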
The standardized normal
$\mu = 0$
$\sigma = 1$
[Figure: standardized normal curve with an x scale and the corresponding z scale, marked from -3 to +3]
Test Statistic and Decision Rule
Critical Region, Critical Value,
and Significance Level
Type I Error
A Type I error is the decision error when the
researcher incorrectly rejects the null hypothesis
(when the null is true).
The probability of that error is $\alpha$.
$\alpha$ is the probability that the test statistic lies in the
critical region when the null hypothesis is true.
When the null is rejected, we say that the test is
statistically significant at a $100\alpha$% significance
level.
The p-Value
A p-value is the lowest level (of
significance) at which the observed value
of the test statistic is significant.
The p-value gives the researcher an
alternative to merely rejecting or not
rejecting the null.
A small p-value is strong evidence against $H_0$
Summary For Hypothesis Testing
State the null hypothesis $H_0: \theta = \theta_0$
Choose an appropriate alternative hypothesis $H_a: \theta < \theta_0$,
$\theta > \theta_0$, or $\theta \ne \theta_0$
Choose a significance level of size $\alpha$
Select the appropriate test statistic and critical region (if the
decision is based on a p-value, the critical region is not
necessary) and state the decision rule in terms of the test
statistic
Compute the value of the test statistic from the sample data
Reject $H_0$ based on the decision rule (if the test statistic is in the
critical region or if the p-value is less than $\alpha$); otherwise do not
reject $H_0$
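The steps of the summary can be sketched for one concrete case. The slides do not name a specific test, so this example assumes a one-sample, upper-tailed z test on a mean with known standard deviation; the function name and all numbers (null value 57, sigma 10, n = 25, sample mean 61, significance level 0.05) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(sample_mean, theta0, sigma, n, alpha=0.05):
    """Test H0: theta = theta0 against Ha: theta > theta0 (sigma assumed known)."""
    z = (sample_mean - theta0) / (sigma / sqrt(n))  # test statistic
    p_value = 1 - NormalDist().cdf(z)               # upper-tail probability
    reject = p_value < alpha                        # decision rule
    return z, p_value, reject


z, p_value, reject = upper_tail_z_test(sample_mean=61.0, theta0=57.0, sigma=10.0, n=25)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Here the p-value is below the chosen significance level, so the null hypothesis is rejected; with a stricter level such as 0.01 the same data would not lead to rejection.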