
Chapter 7 – Measurement System Analysis

7.1 Introduction

Measurement System Analysis (MSA) is the first step of the measure phase along the
DMAIC pathway to improvement. You will be basing the success of your improvement
project on key performance indicators that are tied to your measurement system. Consequently,
before you begin tracking metrics you will need to complete an MSA to validate the measurement
system. A comprehensive MSA typically consists of six parts: Instrument Detection Limit,
Method Detection Limit, Accuracy, Linearity, Gage R&R and Long-Term Stability. If you want
to expand measurement capacity or qualify another instrument you must expand the MSA to
include Metrology Correlation and Matching.
A poor measurement system can make data meaningless and process improvement
impossible. Large measurement error will prevent assessment of process stability and capability,
confound Root Cause Analysis and hamper continuous improvement activities in manufacturing
operations. Measurement error has a direct impact on assessing the stability and capability of a
process. Poor metrology can make a stable process appear unstable and make a capable process
appear incapable. Measurement System Analysis quantifies the effect of measurement error on
the total variation of a unit operation. The sources of this variation may be visualized as in
Figure 7.1 and the elements of a measurement system as in Figure 7.2.

[Figure 7.1 is a tree diagram: Observed Process Variation splits into Actual Process Variation
and Measurement Variation. Actual Process Variation comprises Long-term Process Variation,
Short-term Process Variation and Variation within Sample. Measurement Variation comprises
Variation due to Gage (Repeatability, Calibration, Stability, Linearity) and Variation due to
Operators.]

Figure 7.1 Sources of Variation

Operators are often skeptical of measurement systems, especially those that provide them
with false feedback causing them to “over-steer” their process. This skepticism is well founded
since many measurement systems are not capable of accurately or precisely measuring the
process. Accuracy refers to the average of individual measurements compared with the known,
true value. Precision refers to the grouping of the individual measurements - the tighter the

grouping, the higher the precision. The bull’s eye targets of Figure 7.3 best illustrate the
difference between accuracy and precision.

Figure 7.2 Measurement System Elements

[Figure 7.3 shows four bull's eye targets: Good Accuracy / Bad Precision, Bad Accuracy / Bad
Precision, Bad Accuracy / Good Precision and Good Accuracy / Good Precision.]

Figure 7.3 Accuracy vs Precision – The Center of the Target is the Objective

Accuracy is influenced by resolution, bias, linearity and stability whereas precision is
influenced by repeatability and reproducibility of the measurement system. Repeatability is the
variation which occurs when the same operator repeatedly measures the same sample on the
same instrument under the same conditions. Reproducibility is the variation which occurs
between two or more instruments or operators measuring the same sample with the same
measurement method in a stable environment. The total variance in a quality characteristic of a
process is described by Eqn 7.1 and Eqn 7.2 while the percent contribution of the measurement
system to the total variance may be calculated from Eqn 7.3.

Equation 7.1 Total Variance

σ²total = σ²product + σ²measurement     Eqn 7.1

where σ²total = total variance
      σ²product = variance due to product
      σ²measurement = variance due to measurement system

Equation 7.2 Measurement System Variance

σ²measurement = σ²repeatability + σ²reproducibility     Eqn 7.2

where σ²repeatability = variance within operator/device combination
      σ²reproducibility = variance between operators

Equation 7.3 % Contribution of Measurement System to Total Variance

% Contribution = (σ²repeatability + σ²reproducibility) / σ²total × 100     Eqn 7.3

We want to measure true variations in product quality, not variations in the measurement
system, so it is desirable to minimize σ²measurement. We will review the steps in a
typical measurement system analysis by way of example, first for the case of variables data and
then for the case of attribute data.
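The variance decomposition of Eqns 7.1-7.3 can be sketched in a few lines of Python. The variance components below are illustrative values, not results from any case study in this chapter.

```python
# Sketch of Eqns 7.1-7.3. The variance components below are
# illustrative values, not case study results.
def percent_contribution(var_repeatability, var_reproducibility, var_product):
    """Percent of total variance contributed by the measurement system."""
    var_measurement = var_repeatability + var_reproducibility  # Eqn 7.2
    var_total = var_product + var_measurement                  # Eqn 7.1
    return 100.0 * var_measurement / var_total                 # Eqn 7.3

print(round(percent_contribution(0.02, 0.01, 1.5), 2))  # -> 1.96
```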

7.2 Instrument Detection Limit (IDL)

Today’s measurement devices are an order of magnitude more complex than the “gages”
for which the Automotive Industry Action Group (AIAG) first developed Gage Repeatability and

Reproducibility (Gage R&R) studies. Typically they are electromechanical devices with internal
microprocessors having inherent signal to noise ratios. The Instrument Detection Limit (IDL)
should be calculated from the baseline noise of the instrument. Let us examine the case where a
gas chromatograph (GC) is being used to measure the concentration of some analyte of interest.
Refer to Figure 7.4.

Figure 7.4 Gas Chromatogram

The chromatogram has a baseline with peaks at different column retention times for
hydrogen, argon, oxygen, nitrogen, methane and carbon monoxide. Let’s say we wanted to
calculate the IDL for nitrogen at retention time 5.2 min. We would purge and evacuate the
column to make sure it is clean then successively inject seven blanks of the carrier gas (helium).
The baseline noise peak at retention time 5.2 min is integrated for each of the blank injections
and converted to concentration units of nitrogen. The standard deviation of these concentrations
is multiplied by the Student's t statistic for n-1 degrees of freedom at the 99% confidence level
(3.143) to calculate the IDL. This is the EPA protocol as defined in 40 CFR Part 136:
Guidelines Establishing Test Procedures for the Analysis of Pollutants, Appendix B. Refer to
Figure 7.5 below for the calculation summary.

Injection No.   N2 (ppm)
1               0.01449
2               0.01453
3               0.01456
4               0.01459
5               0.01442
6               0.01440
7               0.01447

StDev = 0.00007044, Mean = 0.01449, RSD = 0.49%

df = n - 1 = 6
IDL = t(df, 1-α = 0.99) × StDev = 3.143 × 0.00007044 = 0.0002214 ppm N2

(From the percentiles of the t-distribution, t = 3.143 for df = 6 at the 99.0% level.)

Figure 7.5 Instrument Detection Limit (IDL) Calculation
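The Figure 7.5 calculation can be reproduced with a short script; the injection values and the t statistic are taken directly from the figure.

```python
import statistics

# EPA-style IDL from the seven blank injections in Figure 7.5.
# t = 3.143 is Student's t for df = 6 at the 99% confidence level.
blanks_ppm = [0.01449, 0.01453, 0.01456, 0.01459, 0.01442, 0.01440, 0.01447]

T_99_DF6 = 3.143
idl = T_99_DF6 * statistics.stdev(blanks_ppm)
print(f"StDev = {statistics.stdev(blanks_ppm):.8f}, IDL = {idl:.7f} ppm N2")
```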

7.3 Method Detection Limit (MDL)

Method detection limit (MDL) is defined as the minimum concentration of a substance
that can be measured and reported with 99% confidence that the analyte concentration is greater
than zero, as determined from analysis of a sample in a given matrix containing the analyte.
MDL is calculated in a similar way to IDL with the exception that the same sample is measured
on the instrument with n=7 trials and the sample is disconnected and reconnected to the
measurement apparatus between trials. This is called dynamic repeatability analysis. An
estimate is made of the MDL and a sample prepared at or near this MDL concentration. The
seven trials are then measured on the instrument and the MDL calculated as in Figure 7.6.
Trial No.   N2 (ppm)
1           0.3596
2           0.3010
3           0.3227
4           0.3239
5           0.3335
6           0.3196
7           0.3365

StDev = 0.01801, Mean = 0.3281, RSD = 5.49%

df = n - 1 = 6
MDL = t(df, 1-α = 0.99) × StDev = 3.143 × 0.01801 = 0.05660 ppm N2
MDL/X-bar = 17.2%

(From the percentiles of the t-distribution, t = 3.143 for df = 6 at the 99.0% level.)

Figure 7.6 Method Detection Limit (MDL) Calculation

MDL divided by the mean of the seven trials should be within 10-100%. If this is not the
case, repeat the MDL analysis with a starting sample concentration closer to the calculated MDL.
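The Figure 7.6 calculation and the MDL/mean sanity check can be sketched together; the trial values and t statistic are taken from the figure.

```python
import statistics

# MDL from the seven trials in Figure 7.6, plus the MDL/mean check
# described above (repeat the study if the ratio falls outside 10-100%).
trials_ppm = [0.3596, 0.3010, 0.3227, 0.3239, 0.3335, 0.3196, 0.3365]

T_99_DF6 = 3.143  # Student's t, df = 6, 99% confidence level
mdl = T_99_DF6 * statistics.stdev(trials_ppm)
ratio = mdl / statistics.mean(trials_ppm)
print(f"MDL = {mdl:.5f} ppm N2, MDL/mean = {ratio:.1%}")
if not 0.10 <= ratio <= 1.00:
    print("Repeat with a starting sample concentration closer to the MDL")
```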

7.4 Measurement System Analysis – Variables Data

A properly conducted measurement system analysis (MSA) can yield a treasure trove of
information about your measurement system. Repeatability, reproducibility, resolution, bias, and
precision to tolerance ratio are all deliverables of the MSA and can be used to identify areas for
improvement in your measurement system. It is important to conduct the MSA in the current
state since this is your present feedback mechanism for your process. Resist the temptation to
dust off the Standard Operating Procedure and brief the operators on the correct way to measure
the parts. Resist the temptation to replace the NIST⁸-traceable standard, which looks like it has
been kicked around the metrology laboratory a few times.
To prepare for an MSA you must collect samples from the process that span the
specification range of the measurement in question. Include out-of-spec high samples and out-
of-spec low samples. Avoid creating samples artificially in the laboratory. There may be
complicating factors in the commercial process which influence your measurement system.
Include all Operators in the MSA who routinely measure the product. The number of samples
times the number of Operators should be greater than or equal to fifteen, with three trials for each
sample. If this is not practical, increase the number of trials as per Figure 7.7.

⁸ National Institute of Standards and Technology

Samples × Operators    Trials
S × O ≥ 15             3
8 ≤ S × O < 15         4
5 ≤ S × O < 8          5
S × O < 5              6

Figure 7.7 Measurement System Analysis Design
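The Figure 7.7 design rule can be expressed as a small lookup function. The function name is ours, introduced only for illustration.

```python
# Lookup of the Figure 7.7 MSA design rule: fewer sample-operator
# combinations require more trials per sample.
def trials_required(n_samples, n_operators):
    """Trials per sample given the samples x operators product."""
    sxo = n_samples * n_operators
    if sxo >= 15:
        return 3
    if sxo >= 8:
        return 4
    if sxo >= 5:
        return 5
    return 6

print(trials_required(5, 3), trials_required(3, 2))  # -> 3 5
```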

Code the samples such that the coding gives no indication to the expected measurement
value – this is called blind sample coding. Have each sample measured by an outside laboratory.
These measurements will serve as your reference values. Ask each Operator to measure each
sample three times in random sequence. Ensure that the Operators do not “compare notes”. We
will utilize Minitab to analyze the measurement system described in Case Study III.

Case Study III: Minnesota Polymer Co.


Minnesota Polymer Co. supplies a special grade of resin to ABC Molding Co. which includes a silica modifier to
improve dimensional stability. The product code is POMBLK-15 and the silica concentration specification by
weight is 15 ± 2%. Silica concentration is determined by taking a sample of the powdered resin and pressing it into
a 4 cm disk using a 25-ton hydraulic press. The sample disk is then analyzed by x-ray fluorescence energy
dispersive spectroscopy (XRF-EDS) to measure the silica content. Manufacturing POMBLK-15 is difficult. The
silica is light and fluffy and sometimes gets stuck in the auger used to feed the mixing tank. A new process
engineer, Penelope Banks, has been hired by Minnesota Polymer. One of her first assignments is to improve
POMBLK-15 process control. SPC analysis of historical batch silica concentration results has indicated out-of-
control symptoms and poor Cpk. Before Penny makes any changes to the process she prudently decides to conduct a
measurement system analysis to find out the contribution of the measurement system to the process variation.

Minnesota Polymer is a firm believer in process ownership. The same operator who charges the raw materials and
runs the manufacturing process also collects the quality control sample, presses the sample disk and then runs the
silica analysis on the XRF-EDS instrument. The operator uses the silica concentration analysis results to adjust the
silica charge on the succeeding batch. POMBLK-15 is typically run over a five-day period in the three-shift, 24/7
operation.

Penny has collected five powder samples from POMBLK-15 process retains which span the silica specification
range and included two out-of-specification samples pulled from quarantine lots. She has asked each of the three
shift operators to randomly analyze three samples from each powder bag for silica content according to her sampling
plan. Penny has sent a portion of each sample powder to the Company’s R&D Headquarters in Hong Kong for
silica analysis. These results will serve as reference values for each sample. The following table summarizes the
silica concentration measurements and Figure 7.8 captures the screen shots of the MSA steps for Case Study III.

Sample Operator 1 Operator 2 Operator 3
Bag # Reference 1 Trial 1 Trial 2 Trial 3 Trial 1 Trial 2 Trial 3 Trial 1 Trial 2 Trial 3
1 17.3 18.2 17.9 18.2 18.1 18.0 18.0 17.8 17.8 18.2
2 14.0 14.4 14.9 14.8 14.8 14.6 14.8 14.4 14.4 14.5
3 13.3 14.0 13.9 13.8 13.9 14.2 14.0 13.8 13.7 13.8
4 16.7 17.2 17.2 17.4 17.4 17.3 17.5 17.4 17.5 17.5
5 12.0 12.9 12.8 12.5 12.5 12.9 12.8 12.9 12.5 12.6

1
As Reported by Hong Kong R&D Center

Figure 7.8 Measurement System Analysis Steps – Variable Data

Open a new worksheet. Click on Stat > Quality Tools > Gage Study > Create Gage R&R Study
Worksheet on the top menu.

Enter the Number of Operators, the Number of Replicates and the Number of Parts in the dialogue box.
Click OK.

The worksheet is modified to include a randomized run order of the samples.

Name the adjoining column Silica Conc and transcribe the random sample measurement data to the
relevant cells in the worksheet.

Click on Stat > Quality Tools > Gage Study > Gage R&R Study (Crossed) on the top menu.

Select C2 Parts for Part numbers, C3 Operators for Operators and C4 Silica Conc for Measurement data in
the dialogue box. Click the radio toggle button for ANOVA under Method of Analysis. Click Options.

Six (6) standard deviations will account for 99.73% of the Measurement System variation. Enter Lower
Spec Limit and Upper Spec Limit in the dialogue box. Click OK. Click OK.

[The Gage R&R (ANOVA) Report for Silica Conc contains six panels: Components of Variation
(% Contribution, % Study Var and % Tolerance bars for Gage R&R, Repeatability, Reproducibility
and Part-to-Part), Silica Conc by Parts, R Chart by Operators (UCL = 0.6693, R-bar = 0.26,
LCL = 0), Silica Conc by Operators, Xbar Chart by Operators (UCL = 15.593,
X-double-bar = 15.327, LCL = 15.061) and the Parts * Operators Interaction plot.]

A new graph is created in the Minitab project file with the Gage R&R analysis results.

Return to the session by clicking on Window > Session on the top menu to view the ANOVA
analytical results.

Let us more closely examine the graphical output of the Gage R&R (ANOVA) Report for
Silica Conc. Figure 7.9 shows the components of variation. A good measurement system will
have the lion’s share of variation coming from the product, not the measurement system.
Consequently, we would like the bars for repeatability and reproducibility to be small relative to
part-to-part variation. Figure 7.10 captures the range SPC chart by Operators. The range chart
should be in control. If it is not, a repeatability problem is present. By contrast, the X-bar SPC
chart of Figure 7.11 should be out of control. This seems counterintuitive but it is a healthy
indication that the variability present is due to part-to-part differences rather than Operator-to-
Operator differences. Figure 7.12 is an individual value plot of silica concentration by sample
number. The circles with a cross indicate the mean of the sample data and the solid circles are
individual data points. We want a tight grouping around the mean for each sample and we want
significant variation between the means of different samples. If we do not have variation

between samples the MSA has been poorly designed and we essentially have five samples of the
same thing. This will preclude analysis of the measurement system. Figure 7.13 is a boxplot of
silica concentration by Operator. As in Figure 7.12 the circles with a cross indicate the mean
concentration for all samples by Operator. The shaded boxes represent the interquartile range
(Q3-Q1) for each Operator. The interquartile range (IQR) is the preferred measure of spread for
data sets which are not normally distributed. The solid line within the IQR is the median silica
concentration of all samples by Operator. If Operators are performing the same, we would
expect similar means, medians and IQRs. Figure 7.14 is an individual value plot used to check
for Operator-Sample interactions. The lines for each Operator should be reasonably parallel to
each other. Crossing lines indicate the presence of Operator-Sample interactions. This can
happen when Operators are struggling with samples at or near the MDL or if the instrument
signal to noise ratio varies as a function of concentration.

Figure 7.9 MSA Components of Variation

Figure 7.10 MSA Range Chart by Operators

Figure 7.11 MSA X-bar Chart by Operators

Figure 7.12 MSA Silica Concentration by Sample Number

Figure 7.13 MSA Silica Concentration by Operator

Figure 7.14 MSA Sample by Operator Interaction

Let us now focus on the analytical output of the session window as captured in Figure
7.8. Lovers of Gage R&Rs will typically look for four (4) metrics as defined below and expect
these metrics to be within the acceptable or excellent ranges specified by Gage R&R Metric
Rules of Thumb as shown in Figure 7.15.
Equation 7.4 % Contribution

% Contribution = σ²measurement / σ²total × 100     Eqn 7.4

Equation 7.5 % Study Variation

% Study Variation = σmeasurement / σtotal × 100     Eqn 7.5

Equation 7.6 Two-Sided Spec % Precision to Tolerance Ratio

Two-Sided Spec % P/T = 6σmeasurement / (USL - LSL) × 100     Eqn 7.6

Equation 7.7 One-Sided Spec % Precision to Tolerance Ratio

One-Sided Spec % P/T = 3σmeasurement / TOL × 100     Eqn 7.7

Equation 7.8 Number of Distinct Categories

Number of Distinct Categories = trunc(1.41 σtotal / σmeasurement)     Eqn 7.8

138
where 2total = Total Variance
2measurement = Variance due to Measurement System
total = Total Standard Deviation
measurement = Standard Deviation due to Measurement System
P/T = Precision to Tolerance Ratio
USL = Upper Spec Limit
LSL = Lower Spec Limit
TOL = Process Mean – LSL for LSL only
TOL = USL – Process Mean for USL only
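As a sketch, Eqns 7.4-7.8 can be computed directly from the variance components. The inputs below are illustrative values, not the Minitab output of the case study.

```python
import math

# Sketch of Eqns 7.4-7.8. The variance components and spec limits
# are illustrative inputs, not the case study Minitab output.
def gage_metrics(var_measurement, var_total, usl, lsl):
    sd_meas = math.sqrt(var_measurement)
    sd_total = math.sqrt(var_total)
    return {
        "% Contribution": 100 * var_measurement / var_total,           # Eqn 7.4
        "% Study Variation": 100 * sd_meas / sd_total,                 # Eqn 7.5
        "% P/T (two-sided)": 100 * 6 * sd_meas / (usl - lsl),          # Eqn 7.6
        "Distinct Categories": math.trunc(1.41 * sd_total / sd_meas),  # Eqn 7.8
    }

m = gage_metrics(var_measurement=0.0156, var_total=2.84, usl=17.0, lsl=13.0)
for name, value in m.items():
    print(name, round(value, 2))
```

Each result can then be compared against the rules of thumb in Figure 7.15 to judge the measurement system.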

Gage R&R Metric                Unacceptable    Acceptable    Excellent
% Contribution                 > 7.7%          2.0 - 7.7%    < 2%
% Study Variation              > 28%           14 - 28%      < 14%
% P/T Ratio                    > 30%           8 - 30%       < 8%
Number of Distinct Categories  < 5             5 - 10        > 10

Figure 7.15 Gage R&R Metrics – Rules of Thumb

The highlighted output of the Minitab session window indicates a % Contribution of the
measurement system of 0.55%. This is in the excellent region. % Study Variation is 7.39%
which is also in the excellent region. Precision to Tolerance ratio is 25.37%. This is in the
acceptable region. Number of distinct categories is 19, well within the excellent region. Overall,
this is a good measurement system. Now, let us proceed to check for linearity and bias by
adding the reference concentrations as measured by the Hong Kong R&D Center for each of the
samples to the worksheet. Figure 7.16 captures the screen shots necessary for this process.

Figure 7.16 Gage Linearity and Bias Study Steps – Variable Data

Return to the active worksheet by clicking on Window > Worksheet 1 *** on the top menu. Name the
adjoining column Reference Conc and enter the reference sample concentration values corresponding to
each sample (Part) number.

Click on Stat > Quality Tools > Gage Study > Gage Linearity and Bias Study on the top menu.

Select C2 Parts for Part numbers, C5 Reference Conc for Reference values and C4 Silica Conc for
Measurement data in the dialogue box. Click OK.

[The Gage Linearity and Bias Report for Silica Conc plots bias versus reference value with the
regression line and its 95% CI. Gage Linearity: Constant 0.5443 (SE 0.1826, P 0.005), Slope
0.00835 (SE 0.01234, P 0.502), S 0.167550, R-Sq 1.1%. Gage Bias: Average 0.666667 (P 0.000);
bias at reference 12 = 0.711111, at 13.3 = 0.600000, at 14 = 0.622222, at 16.7 = 0.677778,
at 17.3 = 0.722222 (all P 0.000).]

A new graph is created in the Minitab project file with the Gage Linearity and Bias Study results.

We can see there is a bias between the Hong Kong measurement system and Minnesota
Polymer’s measurement system. The bias is relatively constant over the silica concentration
range of interest as indicated by the regression line. The Minnesota Polymer measurement
system is reading approximately 0.67 wt % Silica higher than Hong Kong. This is not saying
that the Hong Kong instrument is right and the Minnesota Polymer instrument is wrong. It is
merely saying that there is a difference between the two instruments which must be investigated.
This difference could have process capability implications if it is validated. Minnesota Polymer
may be operating in the top half of the allowable spec range. The logical next step is for the
Hong Kong R&D center to conduct an MSA of similar design, ideally with the same sample set
utilized by Minnesota Polymer.

7.5 Measurement System Analysis – Attribute Data

In our next case we will analyze the measurement system used to rate customer
satisfaction as described in Case Study IV below.

Case Study IV: Virtual Cable Co.


David Raffles Lee has just joined Virtual Cable Co., the leading telecommunications company in the southwest as
Chief Executive Officer. David comes to Virtual Cable with over thirty years of operations experience in the
telecommunications industry in Singapore. During a tour of one of the Customer Service Centers, David noticed
that the customer service agents were all encased in bulletproof glass. David queried the Customer Service
Manager, Bob Londale about this and Bob responded, “It is for the protection of our associates. Sometimes our
customers become angry and they produce weapons.” David was rather shocked about this and wanted to learn
more about customer satisfaction at Virtual Cable. He formed a team to analyze the measurement of customer
satisfaction. This team prepared ten scripts of typical customer complaints with an intended outcome of pass –
customer was satisfied with the customer service agent’s response or fail – customer was dissatisfied with the
response. Twenty “customers” were coached on the scripts, two customers per script. These customers
committed the scripts to memory and presented their service issue to three different Customer Service Agents at
three different Customer Service Centers. Each customer was issued an account number and profile to allow the
Customer Service Agent to rate the customer’s satisfaction level in the customer feedback database as required by
Virtual Cable’s policy. The results are summarized in the attached table and analyzed by the MSA attribute data
steps of Figure 7.17.

Operator 1 Operator 2 Operator 3


Script # Reference1 Rep 1 Rep 2 Rep 1 Rep 2 Rep 1 Rep 2
1 F F F F F F F
2 P P P P P P P
3 P P P P P P P
4 P P P P P P P
5 F F F F P F F
6 P P P P P P P
7 F F F F F F F
8 F F F F F F F
9 P P F P P F P
10 F F F F F F F

1
Intended outcome of script from Customer Satisfaction Team

Figure 7.17 Measurement System Analysis Steps – Attribute Data

Open a new worksheet. Click on Stat > Quality Tools > Create Attribute Agreement Analysis Worksheet
on the top menu.

Enter the Number of samples, the Number of appraisers and the Number of replicates in the dialogue box.
Click OK.

The worksheet is modified to include a randomized run order of the scripts (samples).

Name the adjoining columns Response and Reference. Transcribe the satisfaction level rating and the
reference value of the script to the appropriate cells.

Click on Stat > Quality Tools > Attribute Agreement Analysis on the top menu.

Select C4 Response for Attribute column, C2 Samples for Samples and C3 Appraisers for Appraisers in the
dialogue box. Select C5 Reference for Known standard/attribute. Click OK.

[The Assessment Agreement graph contains two panels, Within Appraisers and Appraiser vs
Standard, each plotting percent agreement with 95% CIs for Appraisers 1, 2 and 3 on a
60-100% scale.]

A new graph is created in the Minitab project file with the Attribute Assessment Agreement results.

Display the analytical MSA Attribute Agreement Results by clicking on Window > Session on the top
menu.

The attribute MSA results allow us to determine the percentage overall agreement, the
percentage agreement within appraisers (repeatability), the percentage agreement between
appraisers (reproducibility), the percentage agreement with reference values (accuracy) and the
Kappa Value (index used to determine how much better the measurement system is than random
chance).
From the graphical results we can see that each Customer Service Agent agreed with his or her
own repeat ratings 90% of the time and with the expected (standard) result 90% of the time.
From the analytical results we can see that the agreement between
appraisers was 80% and the overall agreement vs the standard values was 80%. The Kappa
Value for all appraisers vs the standard values was 0.90, indicative of excellent agreement
between the appraised values and reference values. Figure 7.18 provides benchmark
interpretations for Kappa Values.

Attribute MSA - Kappa Value

Kappa Value    Interpretation
-1.0 to 0.6    Agreement expected by chance alone
0.6 to 0.7     Marginal agreement - significant effort required to improve measurement system
0.7 to 0.8     Good agreement - some improvement to measurement system is warranted
0.9 to 1.0     Excellent agreement

Figure 7.18 Rules of Thumb for Interpreting Kappa Values
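As a sketch of how a Kappa Value is obtained, the snippet below computes Cohen's kappa for one appraiser's pooled Pass/Fail ratings against the standard values from the case study table. Minitab's attribute agreement analysis computes kappa somewhat differently, so treat this as an approximation of the idea, not the exact session output.

```python
from collections import Counter

# Sketch of Cohen's kappa: how much better the observed agreement is
# than the agreement expected by chance from the category frequencies.
def cohens_kappa(ratings, standard):
    n = len(ratings)
    p_observed = sum(r == s for r, s in zip(ratings, standard)) / n
    # Chance agreement from the marginal frequency of each category
    rc, sc = Counter(ratings), Counter(standard)
    p_chance = sum(rc[c] * sc[c] for c in rc.keys() | sc.keys()) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

standard = list("FPPPFPFFPF") * 2                    # reference, repeated per rep
operator2 = list("FPPPFPFFPF") + list("FPPPPPFFPF")  # Operator 2, Rep 1 then Rep 2
print(round(cohens_kappa(operator2, standard), 2))   # -> 0.9
```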

Another way of looking at this case is that out of sixty expected outcomes there were
only three miscalls on rating customer satisfaction by the Customer Service Agents included in
this study. Mr. Lee can have confidence in the feedback of the Virtual Cable customer
satisfaction measurement system and proceed to identify and remedy the underlying root causes
of customer dissatisfaction.

7.6 Improving the Measurement System

Improvements to the measurement system should be focused on the root cause(s) of high
measurement system variation. If repeatability is poor, consider a more detailed repeatability
study using one part and one operator over an extended period of time. Ask the operator to
measure this one sample twice per day for one month. Is the afternoon measurement always
greater or always lesser than the morning measurement? Perhaps the instrument is not
adequately cooled. Are the measurements trending up or down during the month? This is an
indication of instrument drift. Is there a gold standard for the instrument? This is one part that
is representative of production parts, kept in a climate-controlled room, handled only with gloves
and carried around on a red velvet pillow. Any instrument must have a gold standard. Even the
kilogram has a gold standard. It is a platinum-iridium cylinder held under glass at the Bureau
International des Poids et Mesures in Sèvres, France. If the gold standard measures differently

during the month the measurement error is not due to the gold standard, it is due to the
measurement system. Consider if the instrument and/or samples are affected by temperature,
humidity, vibration, dust, etc. Set up experiments to validate these effects with data to support
your conclusions. If you are lobbying for the instrument to be relocated to a climate-controlled
clean room you better have the data to justify this move.
If reproducibility is poor, read the Standard Operating Procedure (SOP) in detail. Is the
procedure crystal clear without ambiguity which would lead operators to conduct the procedure
differently? Does the procedure specify instrument calibration before each use? Does the
procedure indicate what to do if the instrument fails the calibration routine? Observe the
operators conducting the procedure. Are they adhering to the procedure? Consider utilizing the
operator with the lowest variation as a mentor/coach for the other operators. Ensure that the SOP
is comprehensive and visual. Functional procedures should be dominated by pictures, diagrams,
sketches, flow charts, etc which clearly demonstrate the order of operations and call out the
critical points of the procedure. Avoid lengthy text SOP’s devoid of graphics. They do not
facilitate memory triangulation – the use of multiple senses to recall learning. Refresher training
should be conducted annually on SOP’s with supervisor audit of the Operator performing the
measurement SOP.

7.7 Long Term Stability

Now that you have performed analyses to establish the Instrument Detection Limit,
Method Detection Limit, Accuracy, Linearity, and Gage R&R metrics of your measurement
system and proven that you have a healthy measurement system; you will need to monitor the
measurement system to ensure that it remains healthy. Stability is typically monitored through
daily measurement of a standard on the instrument in question. If a standard is not available, one
of the samples from the Gage R&R can be utilized as a “Golden Sample”. Each day, after the
instrument is calibrated, the standard is measured on the instrument. An Individuals Moving
Range (IMR) SPC chart is generated as we have covered in Chapter 6. If the standard is in
control then the measurement system is deemed to be in control and this provides the
justification to utilize the instrument to perform commercial analyses on process samples
throughout the day. If the standard is not in control the instrument is deemed to be
nonconforming and a Root Cause Analysis must be initiated to identify the source(s) of the
discrepancy. Once the discrepancy has been identified and corrected, the standard is re-run on
the instrument and the IMR chart refreshed to prove that the instrument is in control. Figure 7.19
shows daily stability measurements from Case Study III, silica concentration measurement of
Golden Sample disk number two.

[The I-MR Chart of Golden Sample 2 Silica Conc plots 28 days of measurements. Individuals
chart: UCL = 15.101, X-bar = 14.67, LCL = 14.239. Moving Range chart: UCL = 0.5295,
MR-bar = 0.1621, LCL = 0.]

Figure 7.19 Measurement System Long Term Stability
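The control limits on an I-MR chart such as Figure 7.19 come from the standard control chart constants. A minimal sketch, using illustrative daily values rather than the case study measurements:

```python
import statistics

# I-MR control limits using the standard constants (2.66 = 3/d2 with
# d2 = 1.128 for n = 2; 3.267 = D4 for n = 2), as covered in Chapter 6.
def imr_limits(values):
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = statistics.mean(values)
    mr_bar = statistics.mean(moving_ranges)
    return {
        "X UCL": x_bar + 2.66 * mr_bar,
        "X center": x_bar,
        "X LCL": x_bar - 2.66 * mr_bar,
        "MR UCL": 3.267 * mr_bar,
        "MR center": mr_bar,
    }

daily = [14.7, 14.6, 14.8, 14.5, 14.7, 14.6, 14.9, 14.6, 14.7, 14.5]  # illustrative
for name, value in imr_limits(daily).items():
    print(f"{name}: {value:.3f}")
```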

7.8 Metrology Correlation and Matching

Metrology correlation is utilized when comparing two measurement systems. This
includes the sample preparation steps required before the actual measurement is conducted, as
this is part of the measurement system. Metrology correlation and matching assessment is
performed when replacing an existing metrology tool with a new metrology tool, expanding
measurement capacity by adding a second tool, comparing customer metrology to supplier
metrology or comparing a metrology tool at one site to a metrology tool at another site.
Metrology correlation analysis is conducted when the two metrology tools are not required to
deliver the exact same output. This occurs when the equipment, fixtures, procedures and
environment of the two measurement tools are not exactly the same. This is a common situation
when comparing customer metrology to supplier metrology. Metrology matching analysis is
conducted when the two metrology tools are required to deliver exactly the same output. This is
a typical condition where a specification exists for a critical quality characteristic.
Before conducting metrology correlation and matching, there are some prerequisites: you must ensure that both metrologies are accurate, capable, and stable. This means that the two measurement systems under consideration must have passed the success criteria for instrument detection limit, method detection limit, accuracy, linearity, Gage R&R, and long-term stability. Correlation and matching are most likely to succeed when the measurement procedures are standardized.

Select a minimum of sixteen samples to be measured on both metrology tools. Samples should be selected so that they span the measurement range of interest (for example, the spec range). Avoid clustering samples around a single measurement value. If necessary, manufacture samples to cover the spec range. It is acceptable to include out-of-spec high and low samples.
In order for two measurement systems to be considered correlated, the R-squared of the least-squares regression of the current instrument versus the proposed instrument must be 75% or higher. If matching is desired, there are two additional requirements: the 95% confidence interval of the slope of the orthogonal regression line must include 1.0, and a paired t-test must pass (i.e., the 95% confidence interval of the mean difference must include zero). This ensures that the bias between the two instruments is not significant.
Let us revisit Penelope Banks at Minnesota Polymer to better understand metrology
correlation and matching protocol. Penny has requisitioned a redundant XRF-EDS to serve as a
critical back-up to the existing XRF-EDS instrument and to provide analysis capacity expansion
for the future. She has been submitting samples for analysis to both instruments for the last
sixteen weeks and has collected the following results. Please refer to Figure 7.20 for correlation
and matching analysis steps.

Sample No. XRF-EDS1 XRF-EDS2


160403-2359D 14.2 14.4
160410-1600A 15.3 15.1
160414-0200B 13.7 13.5
160421-1400C 16.8 17.0
160427-0830C 13.5 13.3
160504-0300D 15.1 15.1
160510-1030A 13.3 13.2
160518-0100B 16.4 16.2
160525-1615C 16.6 16.5
160601-2330D 14.3 14.5
160608-0500D 15.7 15.9
160616-1330A 13.8 13.6
160625-1515C 15.7 15.8
160630-0420D 16.2 16.0
160707-2230B 13.5 13.7
160715-1920B 16.8 17.0
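The correlation criterion can be checked directly from the sixteen paired results above. A minimal Python sketch that computes R-squared by hand from the sums of squares (the same quantity Minitab reports for the least-squares line in the steps below):

```python
# Paired results from the table above (XRF-EDS1 is the reference tool).
eds1 = [14.2, 15.3, 13.7, 16.8, 13.5, 15.1, 13.3, 16.4,
        16.6, 14.3, 15.7, 13.8, 15.7, 16.2, 13.5, 16.8]
eds2 = [14.4, 15.1, 13.5, 17.0, 13.3, 15.1, 13.2, 16.2,
        16.5, 14.5, 15.9, 13.6, 15.8, 16.0, 13.7, 17.0]

n = len(eds1)
mx, my = sum(eds1) / n, sum(eds2) / n
sxx = sum((x - mx) ** 2 for x in eds1)            # sum of squares, X
syy = sum((y - my) ** 2 for y in eds2)            # sum of squares, Y
sxy = sum((x - mx) * (y - my) for x, y in zip(eds1, eds2))

r_squared = sxy ** 2 / (sxx * syy)                # squared Pearson correlation
print(f"R-sq = {r_squared:.1%}")
# -> R-sq = 98.1%, well above the 75% minimum for correlation
```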

Figure 7.20 Metrology Correlation and Matching Steps

Open a new worksheet. Copy and paste the measurement data from the two instruments into the
worksheet.

Click on Graph → Scatterplot on the top menu.

Select With Regression in the dialogue box. Click OK.

Select the reference instrument XRF-EDS1 for the X variables and XRF-EDS2 for the Y variables. Click
OK.

[Scatterplot of XRF-EDS2 vs XRF-EDS1 with least-squares regression line; both axes span approximately 13 to 17.]

A scatter plot is produced with least squares regression line.

Hover your cursor over the least squares regression line. The R-sq = 98.1%, which exceeds the 75% minimum, so correlation is good.

Return to the worksheet. Click on Stat → Regression → Orthogonal Regression on the top menu.

Select XRF-EDS2 for the Response (Y) and the reference instrument XRF-EDS1 for the Predictor (X) variables. Click Options.

Select 95 for the Confidence level. Click OK → then click OK one more time.

[Plot of XRF-EDS2 vs XRF-EDS1 with fitted orthogonal regression line; both axes span approximately 13 to 17.]

A scatter plot is produced with orthogonal regression line.

Click on Window → Session on the top menu. The session window indicates that the 95% confidence interval of the slope includes 1.0, so the slope criterion for matching is met: the two instruments agree across the measurement range.
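The orthogonal regression slope that Minitab reports can also be computed by hand. Assuming equal error variances on the two instruments (Minitab's default error variance ratio of 1), the orthogonal (Deming) slope has a closed form in the sums of squares; a minimal Python sketch using the paired data from the table:

```python
import math

# Paired results from the table above (XRF-EDS1 is the reference tool).
eds1 = [14.2, 15.3, 13.7, 16.8, 13.5, 15.1, 13.3, 16.4,
        16.6, 14.3, 15.7, 13.8, 15.7, 16.2, 13.5, 16.8]
eds2 = [14.4, 15.1, 13.5, 17.0, 13.3, 15.1, 13.2, 16.2,
        16.5, 14.5, 15.9, 13.6, 15.8, 16.0, 13.7, 17.0]

n = len(eds1)
mx, my = sum(eds1) / n, sum(eds2) / n
sxx = sum((x - mx) ** 2 for x in eds1)
syy = sum((y - my) ** 2 for y in eds2)
sxy = sum((x - mx) * (y - my) for x, y in zip(eds1, eds2))

# Orthogonal (Deming) slope with error-variance ratio 1:
slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
print(f"orthogonal slope = {slope:.3f}")
# -> orthogonal slope = 1.032, close enough to 1.0 that its 95%
#    confidence interval (from the session window) includes 1.0
```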

Return to the worksheet. Click on Stat → Basic Statistics → Paired t on the top menu.

Select XRF-EDS1 for Sample 1 and XRF-EDS2 for Sample 2 in the dialogue box. Click Options.

Select 95.0 for Confidence level. Select 0.0 for Hypothesized difference. Select Difference ≠ hypothesized
difference for Alternative hypothesis in the dialogue box. Click OK. Then click OK one more time.

The session window indicates that the 95% confidence interval for the mean difference includes zero, and the P-value for the paired t-test is above the significance level of 0.05. Therefore we fail to reject the null hypothesis: there is no significant bias between the two instruments.
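The paired-t conclusion can likewise be reproduced by hand. A minimal Python sketch that builds the 95% confidence interval for the mean difference, using 2.131, the two-sided critical t value for 15 degrees of freedom:

```python
import math

# Paired results from the table above; difference = XRF-EDS1 - XRF-EDS2.
eds1 = [14.2, 15.3, 13.7, 16.8, 13.5, 15.1, 13.3, 16.4,
        16.6, 14.3, 15.7, 13.8, 15.7, 16.2, 13.5, 16.8]
eds2 = [14.4, 15.1, 13.5, 17.0, 13.3, 15.1, 13.2, 16.2,
        16.5, 14.5, 15.9, 13.6, 15.8, 16.0, 13.7, 17.0]
diffs = [a - b for a, b in zip(eds1, eds2)]

n = len(diffs)
mean_d = sum(diffs) / n                                    # mean bias
sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
se = sd_d / math.sqrt(n)                                   # standard error

t_crit = 2.131  # two-sided 95% critical t value, 15 degrees of freedom
ci = (mean_d - t_crit * se, mean_d + t_crit * se)
print(f"mean difference = {mean_d:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
# The interval spans zero, so the bias between instruments is not significant.
```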

Penelope has proven that XRF-EDS2 is correlated and matched to XRF-EDS1. She may
now use XRF-EDS2 for commercial shipment releases including Certificates of Analysis to her
customers.
