
Lecture 3: Measured data and Statistics

Introduces the concept of measurement variation and statistical methods for
measuring and describing variation
Compiled by Ramdziah Md.Nasir

Recap Lecture 2: TQM


The Road to Business Growth

Business Growth
Continuous Improvement
Supplier Partnering
Employee Involvement
Focus on Quality
Leadership

Process Orientation

Customer Satisfaction
Clear Vision

TQM Implementation
Begins with senior management's and the CEO's commitment
Involvement is required
Requires the education of Senior Management in TQM
concepts

Timing of the implementation process can be very important
Formation of the Quality Council
Development of Core Values, Vision Statement, Mission Statement, Quality Policy Statement

TQM Implementation
Quality Council:

Composed of: the CEO; the senior managers of the functional areas, such as design, marketing, finance, production, and quality; and a coordinator or consultant
The coordinator will ensure that the team members are empowered and know their responsibilities

TQM Implementation
Quality Council Duties:
1. Develop the core values, vision, mission, and quality policy statements
2. Develop the strategic long-term plan with goals and the annual quality improvement program with objectives
3. Create the total education and training plan
4. Determine and continually monitor the cost of poor quality

TQM Implementation
Quality Council Duties:
5. Determine the performance measures for the organization
6. Determine projects that improve the processes
7. Establish multifunctional project and departmental or work group teams
8. Establish or revise the recognition and reward system

TQM Implementation
Core Values for the Malcolm Baldrige
National Quality Award:
Visionary Leadership
Customer-driven Excellence
Organizational & Personal Learning
Valuing Employees & Partners

TQM Implementation
Core Values for the Malcolm Baldrige
National Quality Award (cont'd):
Agility
Focus on the Future
Management for Innovation
Management by Fact

TQM Implementation
Core Values for the Malcolm Baldrige
National Quality Award (cont'd):
Social Responsibility
Focus on Results and Creating Value
Systems Perspective

TQM Implementation
Quality Statements:

Include the Vision Statement, Mission Statement, and Quality Policy Statement
They are part of the strategy planning process, which includes goals and objectives
Developed with input from all personnel

TQM Implementation
Seven Steps to Strategy Planning:

Customer Needs
Customer Positioning
Predict the Future
Gap Analysis
Closing the Gap
Alignment
Implementation


Continuous Process Improvement/Development (CPD)

Figure 2-3 Input/output process model
INPUT: materials, money, information, data, etc.
PROCESS: people, equipment, method, procedures, environment, materials
OUTPUT: information, data, product, service, etc.
Other labels in the figure: FEEDBACK, CONDITIONS, OUTCOMES

Taguchi's Loss Function

$L = D^2 C$

where
L = loss to society
D = distance from target value
C = cost of deviation

[Figure: loss (to producing organization, customer, and society) rising from low loss at the target to high loss away from it, with quality bands labelled Best, Good, Poor and Unacceptable; beneath it, frequency distributions drawn against the lower specification, target and upper specification]

Target-oriented quality yields more product in the "best" category
Target-oriented quality brings product toward the target value
Conformance-oriented quality keeps products within 3 standard deviations
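The loss function can be evaluated directly for a few measured values. The sketch below is only an illustration of L = D^2 C: the function name, the target of 10.0 and the cost coefficient of 4.0 are hypothetical and do not come from the lecture.

```python
def taguchi_loss(value, target, cost_coefficient):
    """Return L = D^2 * C, where D is the distance of value from the target."""
    distance = value - target
    return distance ** 2 * cost_coefficient


# Hypothetical numbers: target 10.0, cost coefficient C = 4.0.
for measured in (10.0, 10.5, 11.0):
    print(measured, taguchi_loss(measured, 10.0, 4.0))
```

Because the distance is squared, doubling the deviation from the target quadruples the loss, even when the part is still inside the specification limits.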

Learning Objectives

Understand the errors related to measurement
Know the round-off rules
Be able to distinguish between the two types of variation: special cause and common cause
Know what statistics is and its applications
Know what distributions are and how they are used in SPC
Be able to calculate the mean, median, mode, range and standard deviation for a set of numbers
Be able to draw a histogram for a set of numbers

Measurement

Any measurement is only as good as the measuring device or technique and the person using it.
Measurement error always exists; a measured value is an estimate.
Accuracy is the smallest unit on the measuring device.
The maximum error of a measurement is half the accuracy.

A distribution is an ordered set of numbers that are grouped in some manner. The distribution may be in table, graph or picture form.

Example:

2.34 cm is accurate to the nearest hundredth of a cm (0.01 cm); therefore the maximum error is 0.01/2 = 0.005 cm.
When a measurement is written, its accuracy is implied by the number of decimal places, e.g. 2.34 cm and 2.340 cm are significantly different! The 2.34 cm value would lie between 2.335 and 2.345, and 2.340 would lie between 2.3395 and 2.3405.
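The limits implied by a written measurement can be worked out mechanically from its number of decimal places. The following is a minimal sketch, assuming the reading is supplied as a string so that trailing zeros are preserved; the helper name `measurement_bounds` is hypothetical.

```python
from decimal import Decimal


def measurement_bounds(reading: str):
    """Return the (lower, upper) limits implied by the decimal places written."""
    value = Decimal(reading)
    # For "2.34" the exponent is -2, so the smallest unit is 10**-2 = 0.01.
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    max_error = unit / 2  # maximum error is half the accuracy
    return value - max_error, value + max_error


print(measurement_bounds("2.34"))
print(measurement_bounds("2.340"))
```

Using `Decimal` rather than `float` keeps the distinction between 2.34 and 2.340, which a binary floating-point value would lose.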

Round-off Rules:
1. If the number to the right is half of that place value or more, round UP to the next digit, e.g. 23.472 to the nearest tenth is 23.5
2. If the number to the right is less than half of that place value, truncate to that place value, e.g. 23.414 to the nearest tenth is 23.4
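The two rules correspond to "round half up". A short sketch with Python's `decimal` module is given here as one possible way to apply them; the function name `round_off` is an assumption for illustration. Note that Python's built-in `round()` sends exact halves to the nearest even digit, so it does not follow rule 1 in every case.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_off(value: str, places: int) -> Decimal:
    """Round value to the given number of decimal places, halves going up."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=1 gives 0.1
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_off("23.472", 1))  # rule 1: 0.072 is more than half of 0.1
print(round_off("23.414", 1))  # rule 2: 0.014 is less than half of 0.1
```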

Variation

2 types: (a) special cause, (b) common cause
Individual measurements are different, but when grouped together they form a predictable pattern called a distribution
Every distribution has measurable characteristics such as:
Location: position of the middle value (or average value)
Spread: width of the distribution curve
Shape: the way measurements stack up

Special cause variation

(a) Special cause variation (or assignable-cause variation)
- Unpredictable variation that does not normally occur, due to worn parts, improper alignment, etc.
- A derived variation
- Can be eliminated by local action on a particular segment of the process
- Local action can handle ~15% of process problems

Common cause variation

(b) Common cause variation
- It is inherent (built-in) in the process, not a derived variation
- Approximately 85% of process problems are due to common cause variation
- Requires process changes to remove the built-in variation, which is a decision for management

Variation (within or between subgroup)

Under ideal conditions, e.g. for a manufacturing process, only common cause variation occurs within a subgroup (batch), and special cause variation occurs between subgroups.
Special causes should occur between batches, not within a batch.
Care must be taken so that differences in operators, machines, and batches of raw materials in production lines do not show up within subgroups.
Use different control charts for different operators if operators make a difference.

Effect of Variation
(a) When special cause variation is present: unpredictable distribution
[Figure: distributions for Day 1, Day 2 and Day 3 relative to the target]

(b) When special cause variation is eliminated, leaving only common cause variation: predictable (process is in statistical control)
[Figure: distributions for Day 1, Day 2 and Day 3 relative to the target]

Histogram

Graphic representation of the frequencies of observed values, usually plotted using rectangles.
Vertical axis is the frequency, horizontal axis is the category.
[Figure: histogram with frequency on the vertical axis and category on the horizontal axis; a cell is marked with its interval i, mid-point, lower boundary and upper boundary]

Steps to construct a histogram

Collect data and construct a tally sheet
Determine the range, $R = X_h - X_l$
Determine the cell interval, $i = R / (1 + 3.322 \log n)$
Determine the cell mid-point, using the lowest data value, $MP_l = X_l + 0.5i$
Determine the cell boundary for either upper or lower boundary
Plot the histogram

Tally sheet

Sample   Tabulation   Frequency
1        III          3
2        IIII         4
3        II           2
4        I            1
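The construction steps can be sketched in code. This is a minimal illustration rather than a definitive procedure: it assumes the logarithm is base 10, that the lowest cell starts at the lowest data value (so its mid-point is X_l + 0.5i), and that the data have a non-zero range; in practice the interval is usually rounded to a convenient value, which is left out here. The function name and the sample data are hypothetical.

```python
import math


def histogram_cells(data):
    """Return (interval, cells); each cell is (lower, midpoint, upper, frequency)."""
    n = len(data)
    x_low, x_high = min(data), max(data)
    data_range = x_high - x_low                          # R = Xh - Xl
    interval = data_range / (1 + 3.322 * math.log10(n))  # i = R / (1 + 3.322 log n)
    n_cells = math.ceil(data_range / interval)
    frequencies = [0] * n_cells
    for x in data:
        # Clamp the index so the largest value falls in the last cell.
        index = min(int((x - x_low) / interval), n_cells - 1)
        frequencies[index] += 1
    cells = []
    for k, freq in enumerate(frequencies):
        lower = x_low + k * interval
        cells.append((lower, lower + interval / 2, lower + interval, freq))
    return interval, cells


# Hypothetical data matching the tally sheet: three 1s, four 2s, two 3s, one 4.
interval, cells = histogram_cells([1, 1, 1, 2, 2, 2, 2, 3, 3, 4])
for lower, midpoint, upper, freq in cells:
    print(f"{lower:.2f} to {upper:.2f} (mid {midpoint:.2f}): {'I' * freq}")
```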

What is Statistics?

Statistics is the science of data handling
Data types: (a) Variable data: quality characteristics that are measurable and normally continuous (may take on any value); (b) Attribute data: quality characteristics that are either present or not present, conforming or non-conforming; countable, normally discrete values (integers)
Its applications normally involve using sample information to make decisions about a population of measurements
A population is the set of all possible data values of interest, while a sample is only a subset (a part) of the population
[Figure: Sample 1, Sample 2 and Sample 3 drawn from a Population]
Four steps in the application of statistics: (a) collection of data; (b) organization of data; (c) analysis of data; (d) interpretation of data

Descriptive vs Inductive Statistics

Descriptive (or deductive) statistics attempts to describe and analyze a subject or group
Inductive statistics tries to determine from a limited amount of data (a sample) an important conclusion about a much larger amount of data (the population). Since the conclusions or inferences cannot be made with absolute certainty, the language of probability is often used

Descriptive Statistics

Measure of central tendency: describes the center position of the data (mean, median, mode)
Measure of dispersion: describes the spread of the data (range, variance, standard deviation)

Mean: $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$, where $X_i$ is one observation and N is the number of observations
Median: the middle point of a data series (the observation in the middle of the sorted data)
Mode: the most frequently occurring value

Example: 100 91 85 84 75 72 72 69 65
Mean = 79.22
Median = 75
Mode = 72
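The three measures of central tendency for the nine values on this slide can be checked with Python's standard `statistics` module. This is only a quick verification sketch; the variable names are illustrative.

```python
import statistics

scores = [100, 91, 85, 84, 75, 72, 72, 69, 65]

mean = statistics.mean(scores)      # sum of the values divided by their count
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequently occurring value

print(round(mean, 2), median, mode)
```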

Descriptive Statistics
Measure of dispersion (range, variance and standard deviation)

The range is calculated by taking the maximum value and subtracting the minimum value.
Example: 1, 3, 5, 7, 9, 11
Range = 11 - 1 = 10

Variance is the sum of the squared differences between each value and the mean, divided by the number of observations:
$\sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}$, where $\mu$ = population mean

Standard deviation is the square root of the variance. It measures the spreading tendency of the data:
$\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}}$

If $\sigma$ is small, there is a high probability of getting values close to the mean value
If $\sigma$ is large, there is a high probability of getting values away from the mean value
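For the data 1, 3, 5, 7, 9, 11 the range, variance and standard deviation can be computed as follows. The sketch assumes the population form (divisor n), as on the slide; `statistics.variance` and `statistics.stdev` would instead divide by n - 1.

```python
import statistics

values = [1, 3, 5, 7, 9, 11]

data_range = max(values) - min(values)   # 11 - 1
variance = statistics.pvariance(values)  # sum of squared deviations / n
std_dev = statistics.pstdev(values)      # square root of the variance

print(data_range, round(variance, 3), round(std_dev, 3))
```

Here the mean is 6, the squared deviations sum to 70, and 70/6 gives a variance of about 11.667.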

Descriptive Statistics
Other measure of dispersion (skewness, kurtosis, coefficient of variation)

Skewness: lack of symmetry of the data distribution. A negative value indicates skew to the left; a positive value indicates skew to the right.
$a_3 < 0$: left; $a_3 = 0$: symmetrical; $a_3 > 0$: right

$a_3 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^3 / n}{s^3}$

Note: $s = \sigma$ = std dev

See examples 4.6, p147 & 4.7, p149

Descriptive Statistics
Measure of Dispersion - Kurtosis

Kurtosis: measures the peakedness of the data. It is a dimensionless value. The value must be compared with that of a normal distribution to determine whether the distribution is more peaked or flatter.

$a_4 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{X})^4 / n}{s^4}$

Note: $s = \sigma$ = std dev

Leptokurtic (more peaked), Mesokurtic (normal), Platykurtic (flatter)

See example 4.8, p151
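The skewness and kurtosis coefficients a3 and a4 can both be computed from a frequency table. The sketch below assumes that s is the standard deviation with divisor n (following the slide's note that s = sigma); the function name and the two small data sets are hypothetical. For reference, a normal distribution has a4 = 3.

```python
def moment_coefficients(values, frequencies):
    """Return (a3, a4) for cell values x_i with frequencies f_i."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / n

    def central_moment(power):
        return sum(f * (x - mean) ** power for x, f in zip(values, frequencies)) / n

    s = central_moment(2) ** 0.5  # assumption: divisor n, i.e. s = sigma
    a3 = central_moment(3) / s ** 3
    a4 = central_moment(4) / s ** 4
    return a3, a4


# Symmetrical data give a3 = 0; a long right-hand tail gives a3 > 0.
print(moment_coefficients([1, 2, 3], [1, 2, 1]))
print(moment_coefficients([1, 2, 3, 10], [5, 3, 1, 1]))
```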

Descriptive Statistics
Measure of Dispersion - Coefficient of variation

CV measures how much variation exists relative to the mean. Unit in %.

$CV = \frac{s}{\bar{X}} \times 100\%$

See example 4.9, p152
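A brief sketch of the coefficient of variation follows. It assumes s is the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes; the function name and the three readings are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Return the standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical readings: mean 10, sample standard deviation 1.
print(coefficient_of_variation([9, 10, 11]))
```

Because CV is a ratio, it allows the spread of data sets with different units or very different means to be compared.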

Normal Distribution (Gaussian distribution)

Always a symmetrical, unimodal, bell-shaped distribution with mean, median and mode having the same value
Much variation in nature and in industry follows the normal distribution curve.
Offers a good description of the variation occurring in most quality characteristics in industry; it becomes the basis of many techniques

[Figure: normal curve with two horizontal scales. x scale: $\mu - 3\sigma$ to $\mu + 3\sigma$; z scale: -3 to +3, where $z = (X_i - \mu)/\sigma$]

Population, sample, reading (notations used)

[Figure: Sample 1, Sample 2 and Sample 3 drawn from a Population]

X-bar = average value for a sample (which has a few readings)
s = standard deviation of a sample
μ (Greek letter mu) = mean (also equivalent to average) value for a population (which has a few samples and readings)
σ (Greek letter sigma) = standard deviation for a population
n = number of readings from a sample
N = number of samples/groups in a population

Use of Statistics in Quality

Changing data into information

Statistical Process Control (SPC)

Historical Background

Walter Shewhart suggested that every process exhibits some degree of variation, and that variation is therefore to be expected. He:
identified two types of variation: chance cause and assignable cause
proposed the first control chart to separate these two types of variation

SPC was applied during World War II to ensure interchangeability of parts for weapons and equipment.
SPC saw a resurgence in the 1980s in response to Japanese manufacturing success.

Product Control And Process Control Philosophy

The product control view:
measures the quality of a product in terms of its acceptability, as measured by conformance to engineering specifications.
emphasizes detection and containment of defective material through inspection/screening, therefore making quality and productivity opposing rather than supportive forces.

The process control view:
emphasizes the prevention of defective material from being made in the first place by seeking the root cause of the problem and eliminating it altogether.
makes quality and productivity enhancement possible simultaneously by continually seeking ways to reduce variation, thereby eliminating waste and inefficiency in the process and variation in performance of the product.

Mean, standard deviation

Example 1
The scores on a test given to students in a large class are normally distributed with a mean of 57 and a standard deviation of 10. The passing score for the exam is 30. If a student is randomly selected, what is the probability that he or she passed the exam? A score of 75 or greater is needed to obtain an A on the exam. What percentage of the students received an A?

Mean, standard deviation

Given: $\mu = 57$ and $\sigma = 10$, and the process represented is a normal distribution.

To calculate the probability of passing, we need to find $P(X \ge 30)$. Use the z transformation to determine the value of Z associated with the value of X:

$z = (30 - 57) / 10 = -2.7$

Therefore, we need to find $P(Z \ge -2.7)$. Using Table A.1 in the Appendix,

$P(Z \ge -2.7) = 1 - P(Z \le -2.7) = 1 - 0.0035 = 0.9965$

Therefore, the probability of a student passing is 99.65%.

To calculate the probability that a randomly selected student scored an A, we need to find $P(X \ge 75)$. Use the z transformation to determine the value of Z associated with the value of X:

$z = (75 - 57) / 10 = 1.8$

Therefore, we need to find $P(Z \ge 1.8)$. Using Table A.1 in the Appendix,

$P(Z \ge 1.8) = 1 - P(Z \le 1.8) = 1 - 0.9641 = 0.0359$

Therefore, the probability of a student scoring an A is 3.59%.
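The table look-ups in Example 1 can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of Table A.1; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=57, sigma=10)

z_pass = (30 - 57) / 10         # z transformation for the passing score
p_pass = 1 - scores.cdf(30)     # P(X >= 30)

z_grade_a = (75 - 57) / 10      # z transformation for an A
p_grade_a = 1 - scores.cdf(75)  # P(X >= 75)

print(f"z = {z_pass:.1f}, P(pass) = {p_pass:.4f}")
print(f"z = {z_grade_a:.1f}, P(A) = {p_grade_a:.4f}")
```

The computed probabilities agree with the tabulated values to four decimal places.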

The accompanying table represents the weight in grams of moulded instrument display panels. The samples were collected at half-hour intervals. Prepare a tally sheet of the individual measurements and then prepare a frequency histogram of the data, clearly labelling the cell boundaries. Comment on the shape of the distribution.

Sample   X1   X2   X3   X4   X5
1        14   15   13   14   13
2        20   18   14   17    8
3        14   17   14   11   14
4        15   16   11   18   14
5         9   17   18   13   12
6        19   15   14   15   16
7        16   13   14   13   17
8        14   17    9   16   15
9        14   14   12   13   13
10       15   13   17   14   16
11       18   18   16   15   11
12       20   12   13   17   14
13       11    8    9   12    7
14       12   14   16   14   20
15       18   17   12   19   18
16       19   17   16   16   17
17       14   13   15   16   18
18       14   17   12   16   11
19       18   15   16   15   12
20       15    9   12   13   20

Answer
The first step in creating a frequency histogram is to select the number of cells and the cell boundaries. Since the data are integer values, the cell boundaries can be set at xx.5, xx + cell width + 0.5, etc. The range of the data is 20 - 7 = 13, so using 14 cells of width 1 (one per integer value from 7 to 20) might work well. Thus, the integer values are the cell midpoints and the cell boundaries are set at 6.5, 7.5, 8.5, ..., 20.5. For example, the boundaries for the cell with midpoint 13 are 12.5 and 13.5.

The next step is to make a tally of the frequency of occurrence of the data within each cell, as follows:

Cell Midpoint   Tally                Frequency, fi
7               1                    1
8               11                   2
9               1111                 4
10                                   0
11              11111                5
12              111111111            9
13              11111111111          11
14              111111111111111111   18
15              11111111111          11
16              111111111111         12
17              11111111111          11
18              111111111            9
19              111                  3
20              1111                 4
Total, $\sum f_i$ = 100

The cell with midpoint 14.0 has boundaries 13.5 and 14.5.

Cell interval: $i = R / (1 + 3.322 \log n)$
Cell midpoint: $MP_l = X_l + i/2$ (normally an odd value)
Cell boundaries = 0.1-i =-0.4 (lower boundary), 3.5+i (upper boundary)
Frequencies: from the tally sheet

Next, this information is used to plot the histogram:
[Figure: histogram of frequency (vertical axis, ticks at 4, 8, 12, 16) against weight in grams (horizontal axis, ticks at 8, 12, 16, 20); Excel form]
From the shape of the histogram, it appears that the data came from a process that might be considered a candidate for representation by a normal distribution.
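The tally for this exercise can be reproduced with a few lines of code. This is a sketch under one stated assumption: the X1 reading of sample 13 is taken as 11, which is the value consistent with the stated range (20 - 7) and with the tally total of 100. The variable names are illustrative.

```python
from collections import Counter

# Rows are samples 1 to 20; columns are readings X1 to X5 (weight in grams).
# Assumption: the X1 reading of sample 13 is taken as 11.
samples = [
    (14, 15, 13, 14, 13), (20, 18, 14, 17, 8), (14, 17, 14, 11, 14),
    (15, 16, 11, 18, 14), (9, 17, 18, 13, 12), (19, 15, 14, 15, 16),
    (16, 13, 14, 13, 17), (14, 17, 9, 16, 15), (14, 14, 12, 13, 13),
    (15, 13, 17, 14, 16), (18, 18, 16, 15, 11), (20, 12, 13, 17, 14),
    (11, 8, 9, 12, 7), (12, 14, 16, 14, 20), (18, 17, 12, 19, 18),
    (19, 17, 16, 16, 17), (14, 13, 15, 16, 18), (14, 17, 12, 16, 11),
    (18, 15, 16, 15, 12), (15, 9, 12, 13, 20),
]

weights = [w for sample in samples for w in sample]
tally = Counter(weights)

# One cell per integer midpoint, with boundaries at midpoint -/+ 0.5.
for midpoint in range(min(weights), max(weights) + 1):
    freq = tally[midpoint]
    print(f"{midpoint - 0.5:4.1f} to {midpoint + 0.5:4.1f} | {'1' * freq:<18} | {freq}")
print("Total:", sum(tally.values()))
```

The printed rows form a sideways text histogram, which is often enough to judge the shape of the distribution before plotting it.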

What is Statistics?
What is it Not

Has Something to Do With Data.
Objectives of Data Collection
Understanding, insights, illumination
An Inexact Science Given Industrial Realities

Probabilities in Manufacturing
Examples with objectives
classifying parts as being defective or non-defective -- reducing the number of defectives
studying the number of monthly orders received -- better adjusting inventory levels to match orders
measuring gas output when acid concentrations are changed -- better predicting and controlling gas levels

Statistical Thinking & Modelling

Engineers Think Deterministically
Deterministic Models Do Not Explain Variability
Deterministic Models Do Not Account for Variability

Engineering Education/Practice Blames
Factors That Remain a Mystery
Limitations in Measurement Process

Engineering Method Depends on Data
Real Data Exhibits Variability
Obscures Ability to Make Sound Decisions

Engineers Must Learn to Think Statistically
Understanding of Risk and Uncertainty
Key is Discovering Sources of Variability

Data Collection
What is the fundamental purpose?
What important questions need answers?

What is the characteristic of interest?
How will it be measured? Issues
What is known about the measurement process?

How does the engineering model impact data collection?
What data does the model require?
How robust is the model to data error?

How do model parameters support problem solution?
Are there physical constraints that impede the ability to collect data?

Statistic types

Deductive statistics describe a complete data set
Inductive statistics deal with a limited amount of data

Types of data

Variables data - quality characteristics that are measurable values.
Measurable and normally continuous; may take on any value.

Attribute data - quality characteristics that are observed to be either present or absent, conforming or nonconforming.
Countable and normally discrete; integer

Within vs. between subgroup variation

Under ideal conditions, only common cause variation occurs within subgroups and special cause variation occurs between subgroups
Special causes should occur between batches, not within
Care must be taken so that differences in operators, machines, lots of raw materials, and production lines do not show up within subgroups.
Use different control charts for different operators if operators make a difference.

Descriptive statistics

Measures of Central Tendency

Describes the center position of the data
Mean, Median, Mode

Measures of Dispersion

Describes the spread of the data
Range, Variance, Standard deviation

Measures of central tendency: Mean

Arithmetic mean: $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$
where $x_i$ is one observation, $\sum$ means add up what follows and N is the number of observations
So, for example, if the data are 0, 2, 5, 9, 12 the mean is (0+2+5+9+12)/5 = 28/5 = 5.6

Measures of central
tendency: Median - mode

Median = the observation in the middle of sorted data
Mode = the most frequently occurring value

Median and mode

100 91 85 84 75 72 72 69 65
Mode = 72
Median = 75
Mean = 79.22

Measures of dispersion:
range

The range is calculated by taking the maximum value and subtracting the minimum value.
2, 4, 6, 8, 10, 12, 14
Range = 14 - 2 = 12

Measures of dispersion:
variance

Calculate the deviation from the mean for every observation.
Square each deviation
Add them up and divide by the number of observations

$\sigma^2 = \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}$

Measures of dispersion:
standard deviation

The standard deviation is the square root of the variance. The variance is in square units, so the standard deviation is in the same units as x.

$\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n}}$

Standard deviation and curve shape

If $\sigma$ is small, there is a high probability of getting a value close to the mean.
If $\sigma$ is large, there is a correspondingly higher probability of getting values further away from the mean.

Chebyshev's theorem

If a probability distribution has the mean $\mu$ and the standard deviation $\sigma$, the probability of obtaining a value which deviates from the mean by at least k standard deviations is at most $1/k^2$.

$P(|x - \mu| \ge k\sigma) \le \frac{1}{k^2}$

As a result
Probability of obtaining a value beyond k standard deviations is at most:
2 standard deviations: $1/2^2 = 1/4 = 0.25$
3 standard deviations: $1/3^2 = 1/9 = 0.11$
4 standard deviations: $1/4^2 = 1/16 = 0.0625$
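The bounds listed above follow directly from 1/k^2. The small sketch below tabulates them; the function name is hypothetical, and the bound is capped at 1 because a probability cannot exceed 1 (relevant only for k below 1).

```python
def chebyshev_bound(k):
    """Upper bound on P(|x - mu| >= k * sigma) for any distribution."""
    if k <= 0:
        raise ValueError("k must be positive")
    return min(1.0, 1 / k ** 2)


for k in (2, 3, 4):
    print(f"{k} standard deviations: at most {chebyshev_bound(k):.4f}")
```

The bound holds for any distribution with a finite mean and standard deviation, which is why it is much looser than the corresponding figures for a normal curve.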

Other measures of
dispersion: skewness

When a distribution lacks symmetry, it is considered skewed.
$a_3 < 0$: left; $a_3 = 0$: symmetrical; $a_3 > 0$: right

$a_3 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{x})^3 / n}{s^3}$

Other measures of
dispersion: kurtosis

Kurtosis suggests the peakedness of the data
$a_4$ can be used to compare distributions

$a_4 = \frac{\sum_{i=1}^{n} f_i (x_i - \bar{X})^4 / n}{s^4}$

The normal frequency distribution

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$
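The normal density can be evaluated directly from its formula. The sketch below writes it out term by term and compares the result with `statistics.NormalDist.pdf` from the standard library; the function name `normal_pdf` and the values mu = 57, sigma = 10 (borrowed from Example 1) are used only for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate f(x) = exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))


print(normal_pdf(0, 0, 1))               # height of the standard normal at its peak
print(normal_pdf(60, 57, 10))            # hand-written formula
print(NormalDist(57, 10).pdf(60))        # library value for comparison
```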

The normal curve

A normal curve is symmetrical about $\mu$
The mean, mode, and median are equal
The curve is unimodal and bell-shaped
Data values concentrate around the mean
Area under the normal curve equals 1

The normal curve

If x follows a bell-shaped (normal) distribution, then the probability that x is within
1 standard deviation of the mean is 68%
2 standard deviations of the mean is 95%
3 standard deviations of the mean is 99.7%
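These three percentages are rounded areas under the standard normal curve. As a quick check, a sketch using `statistics.NormalDist` computes the area between -k and +k for k = 1, 2, 3; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

within = {}
for k in (1, 2, 3):
    # Area between -k and +k standard deviations of the mean.
    within[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} standard deviation(s): {within[k]:.4f}")
```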

The standardized normal

$\mu = 0$, $\sigma = 1$
[Figure: standard normal curve; x scale from $\mu - 3\sigma$ to $\mu + 3\sigma$, z scale from -3 to +3]

Test Statistic and Decision Rule

Critical Region, Critical Value, and Significance Level

Type I Error
A Type I error is the decision error made when the researcher incorrectly rejects the null hypothesis (when the null is true).
The probability of that error is $\alpha$.
$\alpha$ is the probability that the test statistic lies in the critical region when the null hypothesis is true.
When the null is rejected, we say that the test is statistically significant at the $100\alpha\%$ significance level.

The p-Value
A p-value is the lowest level (of significance) at which the observed value of the test statistic is significant.
The p-value gives the researcher an alternative to merely rejecting or not rejecting the null.
A small p-value is strong evidence against $H_0$

Summary For Hypothesis Testing


State the null hypothesis $H_0: \theta = \theta_0$
Choose an appropriate alternative hypothesis $H_a: \theta < \theta_0$, $\theta > \theta_0$, or $\theta \ne \theta_0$
Choose a significance level $\alpha$
Select the appropriate test statistic and critical region (if the decision is based on a p-value, the critical region is not necessary) and state the decision rule in terms of the test statistic
Compute the value of the test statistic from the sample data
Reject $H_0$ based on the decision rule (if the test statistic is in the critical region or if the p-value is less than $\alpha$); otherwise do not reject $H_0$
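The steps above can be sketched for one specific case. The example below assumes a one-sample z-test on a mean with known population standard deviation, which is only one possible choice of test statistic; the function name `z_test`, its parameters and the numbers used (sample mean 59.5, n = 64, hypothesised mean 57, sigma 10) are hypothetical.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (z, p_value, reject_h0) for H0: mu = mu0 with known sigma."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))  # test statistic
    std_normal = NormalDist()
    if alternative == "less":         # Ha: mu < mu0
        p_value = std_normal.cdf(z)
    elif alternative == "greater":    # Ha: mu > mu0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: mu != mu0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    # Decision rule: reject H0 when the p-value is less than alpha.
    return z, p_value, p_value < alpha


z, p_value, reject = z_test(sample_mean=59.5, mu0=57, sigma=10, n=64)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Basing the decision on the p-value means no critical region has to be tabulated; the same function covers all three forms of the alternative hypothesis.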