Errors-in-Variables Regression
Raymond Covert
Technical Director
MCR, LLC
rcovert@mcri.com
Presented to the
European Aerospace Working Group on
Cost Engineering (EACE)
Frascati, Italy
24-25 April 2007
08/10/09 Reprinted with permission of MCR, LLC
© MCR, LLC
Agenda
• Introduction
• Errors-In-Variables Regression
• Sources of Uncertainty
• Examples
– CER Regression with Normalization Uncertainty
– CER Regression with Fuzzy Cost Drivers
– Spacecraft EPS NR and REC CER Example
• EIV Modeling Benefits and Drawbacks
• Summary
Introduction
Background
Regressing Constant Variables
Cost = a + bX^c
EIV Using Crystal Ball
with OptQuest®
• Uncertain variables can be modeled in a spreadsheet using a
statistical simulation tool (Crystal Ball) with optimization
capability (OptQuest®)
• Random variables are defined for the uncertain quantities that
constitute the x,y data points: cost drivers and normalization
assumptions
• Outputs (forecasts) from the statistical simulation are defined for
bias and percent standard error
• CER coefficients are defined as decision variables: find the
coefficients that minimize the mean of the standard error under a
(near) zero bias constraint
• During optimization, random values are generated for the x
and y variables; the examples that follow use 5000 trials
• Each candidate set of CER coefficients is tested against a full set of 5000 trials
• OptQuest® determines the optimum coefficients using scatter
search and tabu search techniques (it does not find minima
via a gradient approach)
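The procedure in the bullets above can be sketched outside a spreadsheet. The following is only an illustration under stated assumptions, not the Crystal Ball or OptQuest® implementation: the CER form y = a * x**b, the data points, the lognormal error model, and every function name are hypothetical, and a plain grid search over b stands in for the scatter and tabu search. For this CER form the zero-bias constraint fixes a once b is chosen, so only b has to be searched.

```python
import math
import random

def simulate_trials(points, n_trials, rng):
    """Draw n_trials noisy copies of the (x, y) data set.

    points: list of (x_mean, x_log_sd, y_mean, y_log_sd); all values are
    hypothetical and the lognormal error model is an assumption.
    """
    trials = []
    for _ in range(n_trials):
        trial = []
        for xm, xs, ym, ys in points:
            x = xm * math.exp(rng.gauss(0.0, xs))
            y = ym * math.exp(rng.gauss(0.0, ys))
            trial.append((x, y))
        trials.append(trial)
    return trials

def zero_bias_a(trials, b):
    """For y = a * x**b, pick a so the mean percent error over all trials is 0."""
    ratios = [y / x ** b for trial in trials for x, y in trial]
    return sum(ratios) / len(ratios)

def mean_percent_se(trials, a, b):
    """Mean over trials of the percent standard error (2 fitted coefficients)."""
    total = 0.0
    for trial in trials:
        sq = 0.0
        for x, y in trial:
            est = a * x ** b
            sq += ((y - est) / est) ** 2
        total += math.sqrt(sq / (len(trial) - 2))
    return total / len(trials)

def fit_eiv(trials, b_lo=0.0, b_hi=2.0, steps=100):
    """Grid search over b; a follows from the zero-bias constraint."""
    best = None
    for i in range(steps + 1):
        b = b_lo + (b_hi - b_lo) * i / steps
        a = zero_bias_a(trials, b)
        se = mean_percent_se(trials, a, b)
        if best is None or se < best[2]:
            best = (a, b, se)
    return best

# Hypothetical data: costs generated from y = 50 * x**0.7, with 10% (log)
# uncertainty on each cost driver and 15% on each normalized cost.
rng = random.Random(2007)
drivers = [10, 25, 60, 120, 250, 500, 900, 1500]
points = [(x, 0.10, 50.0 * x ** 0.7, 0.15) for x in drivers]
trials = simulate_trials(points, 300, rng)
a, b, se = fit_eiv(trials)
print(f"a = {a:.2f}, b = {b:.2f}, mean percent SE = {se:.3f}")
```

The sketch uses 300 trials rather than 5000 only to keep the run short. Every candidate b is scored against the same stored set of trials, which mirrors testing each set of coefficients against a full simulation run.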
Optimizing Using Crystal Ball
and Premium Solver
• Model uncertain variables in a spreadsheet using a
statistical simulation tool such as Crystal Ball
• Random variables are defined for the uncertain quantities that
constitute the x,y data points
– Cost driver uncertainty
– NR/REC split
– Quantities (EDUs, Qual and Protoqual units)
– Inflation
• Outputs (forecasts) from the statistical simulation are
defined
– Uncertain input variables (cost drivers, quantities)
– Output variables (nonrecurring and recurring costs)
– Trial values (1000 trials) for each are dumped into a spreadsheet
• Data are regressed using constrained optimization
(Premium Solver)
– Uses a combined scatter search and gradient approach to find the global
minimum of percent error under the constraint bias = 0
– Produces coefficients for the CER
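The dump-then-regress workflow above can be illustrated with a short script. This is a sketch under assumptions, not the Premium Solver algorithm: the program data, the triangular distributions, the CER form cost = a * x**b, and all names are hypothetical. Random starting exponents play the role of the scatter step, and a golden-section search (a derivative-free method swapped in for the gradient step) refines the best start. The bias = 0 constraint is enforced by solving for a directly.

```python
import math
import random

# Hypothetical programs: (cost driver, total then-year cost, years to inflate)
PROGRAMS = [(40, 2600, 12), (90, 4400, 9), (150, 6800, 7),
            (260, 9500, 5), (420, 13800, 3), (700, 19500, 1)]

def dump_trials(programs, n_trials, rng):
    """Return one row per program per trial: (driver, NR cost, REC cost)."""
    rows = []
    for _ in range(n_trials):
        for driver, total, years in programs:
            x = driver * rng.triangular(0.9, 1.1, 1.0)   # cost driver uncertainty
            nr_share = rng.triangular(0.3, 0.5, 0.4)     # NR/REC split
            rate = rng.triangular(0.02, 0.04, 0.03)      # annual inflation rate
            base_year = total * (1.0 + rate) ** years    # normalized total cost
            rows.append((x, nr_share * base_year, (1.0 - nr_share) * base_year))
    return rows

def percent_se(rows, col, b):
    """Percent standard error of cost = a * x**b, with a chosen so bias = 0."""
    ratios = [r[col] / r[0] ** b for r in rows]
    a = sum(ratios) / len(ratios)          # makes the mean percent error zero
    sq = sum((q / a - 1.0) ** 2 for q in ratios)
    return math.sqrt(sq / (len(rows) - 2)), a

def fit_cer(rows, col, rng, n_starts=20, lo=0.0, hi=2.0, tol=1e-4):
    """Scatter random exponents, then refine the best by golden-section search."""
    start = min((rng.uniform(lo, hi) for _ in range(n_starts)),
                key=lambda b: percent_se(rows, col, b)[0])
    left, right = max(lo, start - 0.5), min(hi, start + 0.5)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    while right - left > tol:
        c = right - g * (right - left)
        d = left + g * (right - left)
        if percent_se(rows, col, c)[0] < percent_se(rows, col, d)[0]:
            right = d
        else:
            left = c
    b = (left + right) / 2.0
    se, a = percent_se(rows, col, b)
    return a, b, se

rng = random.Random(42)
rows = dump_trials(PROGRAMS, 1000, rng)    # 1000 trials, as on the slide
nr = fit_cer(rows, 1, rng)                 # nonrecurring CER (a, b, percent SE)
rec = fit_cer(rows, 2, rng)                # recurring CER (a, b, percent SE)
print("NR  CER: a=%.1f b=%.3f SE=%.3f" % nr)
print("REC CER: a=%.1f b=%.3f SE=%.3f" % rec)
```

Because the simulated rows are pooled into one table before fitting, the regression sees the input uncertainty as scatter in both the driver and the cost columns.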
Sources of Uncertainty
[Figure: CER development process flow, with the sources of uncertainty shown in red on the original slide. Stages: Data Collection, Data Normalization, CER Development, CER Documentation. Inputs: cost reports, schedule reports, design reviews, contract “riders”. Normalization elements: technology, inflation, WBS definition, quantity, scope, normalization assumptions. Intermediate products: data points, cost drivers, data filter. Outputs: regression, CER functions and coefficients, statistics (fit statistics, data statistics).]
Uncertainty in Data Normalization
and 3600)
• Technology: Consistent technology maturity
– Treat all data as if they were built in the economic year of
the model
The EIV Problem
with “Fuzzy” Inputs
CER Development Example
Sources of Uncertainty in
CER-Development Example
• “Then Year” cost units
• Different fiscal years (midyear, peak year?)
• Inflation rates?
• Assumed learning curve?
• Fuzzy definition of “new design” using ordinal scale
• Center frequency, low, or high cutoff frequency?
• Peak, average, or sustainable slew rate?
• + 0.5 m?
Determining CER Coefficient Values
Using Constant (x, y) Values
EIV Solution for T1 CER
Actual vs. Estimated
Plot of T1 CER
Contributors to Variance in
Dependent Variables
• Frequency errors
• Learning errors
• Inflation errors
• Diameter errors
Regression With Uncertain Variables:
Nonrecurring CER
EIV Solution for
Nonrecurring CER
Spacecraft EPS NR and REC CER
Candidate Solutions
Relationship of
Candidate Coefficients
[Figure: two scatter plots of candidate coefficient values against Coef a1 (horizontal axes from -1200 to 200). One vertical axis runs from 0.00 to 300.00 and the other from 0.000 to 0.500; the one surviving panel label is Coef b1.]
Resulting REC CER
[Figure: EIV Estimated Cost (FY06$K) versus Normalized Actual Cost (FY06$K), both axes from 0 to 250,000.]
Minimum NR+REC Problem
[Figure: logarithmic-scale plot against Normalized Actual Cost (FY06$K); horizontal axis from 1,000 to 1,000,000, vertical axis from 1,000 to 100,000.]
EPS REC CER
[Figure: logarithmic-scale plot against Normalized Actual Cost (FY06$K); horizontal axis from 1,000 to 1,000,000, vertical axis from 1,000 to 100,000.]
New Questions Arise
EIV Modeling
Benefits and Drawbacks
Summary