
Automation in Construction 19 (2010) 619–629

Contents lists available at ScienceDirect

Automation in Construction
journal homepage: www.elsevier.com/locate/autcon

Estimate at Completion for construction projects using Evolutionary Support Vector Machine Inference Model
Min-Yuan Cheng a, Hsien-Sheng Peng b,⁎, Yu-Wei Wu a, Te-Lin Chen a
a Department of Construction Engineering, National Taiwan University of Science and Technology, Taiwan
b Ecological and Hazard Mitigation Engineering Research Center, National Taiwan University of Science and Technology, Taiwan

Article info

Article history: Accepted 24 February 2010

Keywords: Estimate at Completion; Fast Messy Genetic Algorithms; Support Vector Machine

Abstract

Construction projects are influenced by a range of factors that impact upon final project cost. Estimate at Completion (EAC) is an important approach used to estimate final project cost, which takes into consideration probable project performance and risks. EAC helps project managers identify potential but still unknown problems and adopt response strategies. This study constructed an evolutionary EAC model to generate project cost estimates that proved significantly more reliable than estimates achievable using currently prevailing formulae. The developed learning model fused two artificial intelligence approaches, namely the fast messy genetic algorithm (fmGA) and Support Vector Machine (SVM), to create an Evolutionary Support Vector Machine Inference Model (ESIM). The ESIM was then applied to estimate final project costs for historical cases. Finally, using the EAC estimate, project cost influence indices, and project cost diagrams, the discrepancy between estimate and practical values was examined to determine potential problems in order to help project managers better control project costs. The learning results were validated in real applications that showed good performance for training models. Providing project managers reliable EAC trend estimates is helpful for their effective control of project costs and taking appropriate peremptory measures to handle potential problems.
© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Engineering projects face myriad uncertainties attributable to chosen construction method as well as environmental and process factors [1,2]. Construction firms typically focus only on budget planning during the initial project stage, which ignores engineering cost changes, information updates and cost management during construction and, in turn, prevents effective project cost control and the identification of potential problems. Cost overruns are frequently not discovered until later project stages, at which time it is typically too late to take effective remedial measures.

Effective project management requires that plans be constantly revised in accordance with actual project conditions. However, factors that influence project cost are numerous, and it is difficult to consider individually each factor of influence at each stage. Moreover, data on construction cost are manifold and variability is great. In order to update cost information item by item in a timely manner, management must adopt an efficient approach to the issue and invest significant time.

The Estimate at Completion (EAC) is a quick and automatic formula used by managers to assess the cost of work to complete schedule activities [1]. Much research has already been done in this area, using a variety of methodologies. Barraza et al. [3] used the concept of stochastic S curves (SS curves) to determine forecasted project estimates as an alternative to using deterministic S curves and traditional forecasting methods. A simulation approach is used for generating the stochastic S curves, and it is based on the defined variability in duration and cost of the individual activities within the process. Lee [4] introduced Stochastic Project Scheduling Simulation (SPSS), a software tool developed to measure the probability of completing a project within a time specified by the user. The SPSS finds the longest path in a network, runs the network a number of times specified by the user, and calculates the stochastic probability of completing the project in the specified time. Lee and Arditi [5] described a stochastic simulation-based scheduling system (S3) that integrates the deterministic critical path method (CPM), the probabilistic program evaluation and review technique (PERT), and the stochastic discrete event simulation (DES) approaches into a single system. The system is based on an earlier version of the system called Stochastic Project Scheduling Simulation and makes use of all the capabilities of this system. Kim and Reinschmidt [6] introduced a new probabilistic forecasting method for schedule performance control and risk management of on-going projects. The Bayesian betaS-curve method (BBM) is

⁎ Corresponding author. #43, Sec. 4, Keelung Rd., Taipei, 106, Taiwan. Tel.: +886 2 27301212; fax: +886 2 27301074.
E-mail address: hspeng@mail.ntust.edu.tw (H.-S. Peng).

0926-5805/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.autcon.2010.02.008
620 M.-Y. Cheng et al. / Automation in Construction 19 (2010) 619–629

Fig. 1. ESIM framework.

based on Bayesian inference and the beta distribution. The BBM provides confidence bounds on predictions, which can be used to determine the range of potential outcomes and the probability of success.

Earned Value Management (EVM) is one of the theoretical methods for determining EAC that was originally developed for cost management and later adopted to forecast project duration. Kim et al. [7] indicated that EVM is gaining wider acceptance due to increasing recognition of its ability both to diminish EVM problems and improve utilities. A broader approach was developed that considers the four-factor groups (i.e. EVM users, EVM methodology, project environment, implementation process) together to improve significantly the acceptance and performance of EVM in different types of organizations and projects. Cioffi [8] presented a new formula and corresponding notation for earned value analysis that makes earned value calculations more transparent and flexible. Vandevoorde and Vanhoucke [9] compared the classic earned value performance indicators SV and SPI with the newly developed earned schedule performance indicators SV(t) and SPI(t), and then presented a generic schedule forecasting formula applicable in different project situations and compared the three methods from literature to forecast total project duration. Finally, the use of each method was illustrated on a simple one activity example project and on real-life project data. Vitner et al. [10] investigated the possibility of using the data envelopment analysis (DEA) approach to evaluate project performances in a multi-project environment, which evaluates projects using earned value management system (EVMS) and multidimensional control system (MPCS) methods. Vanhoucke and Vandevoorde [11] extensively reviewed and evaluated earned value (EV)-based methods to forecast the total project duration, and investigated the potential of a recently developed method, the earned schedule method, which improves the connection between EV metrics and the project duration forecasts. Lipke et al. [12] provided a method to improve the capability of project managers to make informed decisions by providing a reliable method to forecast final cost and duration. Their method and its evaluation made use of a well established project management method, a recent technique developed to analyze schedule performance, and statistical mathematics to develop EVM, earned schedule (ES) and statistical prediction and testing methods. Plaza and Turetken [13]

Table 1
Influencing factors of construction cost.

Classification | Influencing factor | Index | Definition
Time now | Construction duration | Construction progress percentage | Duration to date/revised contract duration
Time now | Actual cost | ACP | Actual cost/budget at completion
Time now | Planned cost | EVP | Earned value/budget at completion
Construction management | Cost management | CPI | Earned value/actual cost
Construction management | Time management | SPI | Earned value/planned value
Construction management | Subcontractor management | Subcontractor billed index | Subcontractor billed amount/actual cost
Contract scope | Contract payment | Owner billed index | Owner billed amount/earned value
Contract scope | Change order | Change order index | Revised contract amount/budget at completion
External environment | Construction price fluctuation | CCI | Construction material price index of that month/construction material price index of initial stage
External environment | No. of rainy days | Climate effect index | (Revised project duration − no. of rainy days)/revised project duration
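The index definitions in Table 1 translate directly into a few ratios per monthly record. The following is a minimal sketch, not the authors' implementation: the `MonthlyRecord` field names and the `safe_ratio` helper are hypothetical, and returning 1.0 when a denominator is zero is an assumption inferred from the first period of each project in Table 3, where ACP and EVP are 0.0% while CPI and SPI are listed as 1.00.

```python
from dataclasses import dataclass


@dataclass
class MonthlyRecord:
    """One monthly snapshot of a project (field names are illustrative)."""
    days_elapsed: float                  # duration to date
    revised_duration_days: float         # revised contract duration
    actual_cost: float                   # AC
    earned_value: float                  # EV
    planned_value: float                 # PV
    budget_at_completion: float          # BAC
    subcontractor_billed: float
    owner_billed: float
    revised_contract_amount: float
    material_price_index_now: float
    material_price_index_initial: float
    rainy_days: float


def safe_ratio(numerator: float, denominator: float, default: float = 1.0) -> float:
    """Return numerator/denominator, or `default` when the denominator is zero.

    The default of 1.0 is an assumption, not something the paper states.
    """
    return numerator / denominator if denominator else default


def input_indices(r: MonthlyRecord) -> dict:
    """Compute the ten Table 1 indices for one period."""
    return {
        "progress": r.days_elapsed / r.revised_duration_days,
        "ACP": r.actual_cost / r.budget_at_completion,
        "EVP": r.earned_value / r.budget_at_completion,
        "CPI": safe_ratio(r.earned_value, r.actual_cost),
        "SPI": safe_ratio(r.earned_value, r.planned_value),
        "subcontractor_billed_index": safe_ratio(r.subcontractor_billed, r.actual_cost),
        "owner_billed_index": safe_ratio(r.owner_billed, r.earned_value),
        "change_order_index": r.revised_contract_amount / r.budget_at_completion,
        "CCI": r.material_price_index_now / r.material_price_index_initial,
        "climate_effect_index": (
            (r.revised_duration_days - r.rainy_days) / r.revised_duration_days
        ),
    }
```

Each call yields one row of model inputs in the shape of Table 3; the monetary fields only need to share a currency, since every index is a ratio.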
Fig. 2. EAC prediction model.

Table 2
Historical cases.

Project name | Total area (m²) | Underground floors | Ground floors | Buildings | Start date | Finish date | Duration (days) | Contract amount (NTD) | ESIM prediction periods | Note
A | 12,622 | 2 | 9 | 1 | 2003/12/1 | 2005/8/22 | 630 | 289,992,000 | 29 |
B | 4919 | 3 | 11 | 1 | 2003/12/13 | 2005/11/10 | 698 | 149,300,000 | 24 |
C | 19,205 | 5 | 8 | 1 | 2000/5/20 | 2002/5/19 | 729 | 332,800,000 | 20 |
D | 5358 | 3 | 9 | 1 | 2000/11/15 | 2002/11/14 | 729 | 199,600,000 | 25 |
E | 27,468 | 2 | 11 | 3 | 1999/12/16 | 2001/12/3 | 718 | 1,142,148,388 | 26 |
F | 31,797 | 2 | 9 | 4 | 2001/7/4 | 2003/3/31 | 635 | 530,000,000 | 20 |
G | 7707 | 2 | 14 | 1 | 2001/11/24 | 2003/10/20 | 695 | 153,500,000 | 22 |
H | 10,087 | 3 | 14 | 1 | 2002/6/18 | 2004/7/6 | 749 | 216,000,000 | 27 |
I | 3479 | 1 | 10 | 1 | 2003/6/2 | 2004/9/30 | 486 | 85,714,286 | 18 |
J | 6352 | 4 | 11 | 1 | 2004/3/5 | 2006/2/18 | 715 | 202,241,810 | 31 |
K | 4774 | 2 | 11 | 1 | 2004/2/21 | 2006/2/20 | 730 | 145,377,589 | 27 |
L | 7289 | 2 | 8 | 1 | 2005/6/15 | 2006/9/15 | 457 | 190,844,707 | 20 | Testing case
M | 3094 | 2 | 7 | 1 | 2005/10/1 | 2007/2/28 | 515 | 102,500,000 | 17 | Testing case
Total ESIM prediction periods: 306
Subtotal (training cases): 269
Subtotal (testing cases): 37

proposed an extended version of EVM (EVM/LC) that addresses the effect of learning on the performance of project teams. A spreadsheet-based decision support tool that automates calculations and analyses was presented in EVM/LC. Leu and Lin [14] attempted to refine and improve the performance of traditional EVM by the introduction of statistical control chart techniques. Individual control charts are used as tools to monitor project performance data in order to detect adverse changes in a timely manner.

At least eight common methods have already been put forward that use EVM to predict the EAC for construction projects [15,16]. Each has been applied to different special projects and achieved differing EAC error rates. When applied to different special projects, the predictions achieved by a single method are extremely accurate for some and present obvious errors for others. This has created confusion in the industry as to which kind of prediction method should be chosen for particular project types. Another issue with EVM is that it must be applied to each distinct construction project process, with revisions conducted manually. This makes EVM both complicated and time consuming. Consequently, computerization of the engineering management process is critical if EVM is to be applied effectively to control construction costs. However, most construction firms in Taiwan use computer systems powerful enough only to analyze initial stage budgets. Systems are not equipped to react to changes at each construction stage or use the EVM method to predict construction project EAC.

Table 3
Input variables.

Training set name and no. | Construction progress percentage | ACP | EVP | CPI | SPI | Subcontractor billed index | Owner billed index | Change order index | CCI | Climate effect index

Project-A-1 3.8% 0.0% 0.0% 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Project-A-2 7.9% 1.8% 3.6% 2.01 0.69 1.00 1.00 1.00 1.03 0.99
Project-A-3 11.7% 3.4% 7.7% 2.24 0.89 1.67 0.75 1.00 1.09 0.98
Project-A-4 15.8% 4.9% 10.1% 2.05 0.99 1.33 0.65 1.00 1.12 0.97
Project-A-5 19.9% 9.0% 11.4% 1.27 0.99 0.73 0.57 1.00 1.11 0.96
Project-A-6 23.8% 12.1% 12.4% 1.02 1.00 1.07 1.05 1.00 1.10 0.95
Project-A-7 27.9% 13.8% 13.7% 1.00 0.99 1.06 1.06 1.00 1.10 0.95
Project-A-8 31.8% 14.5% 15.2% 1.04 1.00 1.14 1.09 1.00 1.12 0.94
Project-A-9 35.9% 18.2% 16.7% 0.92 1.02 0.91 0.99 1.00 1.13 0.92
Project-A-10 40.0% 22.8% 16.7% 0.73 1.02 0.68 0.93 1.00 1.13 0.91
Project-A-11 44.0% 26.9% 23.0% 0.86 0.93 0.64 0.75 1.00 1.14 0.90
Project-A-12 48.0% 32.0% 30.3% 0.95 1.02 0.71 0.75 1.00 1.13 0.90
Project-A-13 52.0% 37.4% 32.9% 0.88 0.94 0.69 0.78 1.00 1.12 0.90
Project-A-14 56.1% 40.3% 39.1% 0.97 0.95 0.74 0.77 1.00 1.12 0.89
Project-A-15 59.9% 42.1% 43.0% 1.02 0.95 0.81 0.79 1.00 1.12 0.88
Project-A-16 67.9% 47.4% 57.0% 1.20 0.94 0.84 0.70 1.00 1.13 0.87
Project-A-17 71.8% 53.1% 66.8% 1.26 0.95 0.86 0.68 1.00 1.12 0.86
Project-A-18 75.9% 62.9% 76.1% 1.21 0.96 0.87 0.72 1.00 1.11 0.84
Project-A-19 79.9% 71.2% 85.8% 1.20 0.97 0.88 0.73 1.00 1.11 0.83
Project-A-20 84.0% 83.5% 97.3% 1.16 0.99 0.97 0.83 1.00 1.11 0.82
Project-A-21 88.0% 92.0% 98.2% 1.07 0.99 0.88 0.82 1.00 1.12 0.81
Project-A-22 92.0% 94.1% 98.5% 1.05 0.99 0.95 0.91 1.00 1.12 0.80
Project-A-23 96.1% 97.1% 100.2% 1.03 0.99 0.92 0.89 1.01 1.12 0.80
Project-A-24 100.0% 98.9% 100.2% 1.01 0.99 0.91 0.89 1.01 1.12 0.79
Project-A-25 104.1% 100.2% 101.4% 1.01 1.00 1.01 1.00 1.01 1.12 0.79
Project-A-26 107.9% 100.4% 101.4% 1.01 1.00 1.01 1.00 1.01 1.13 0.79
Project-A-27 111.8% 100.6% 101.4% 1.01 1.00 1.01 1.00 1.01 1.14 0.78
Project-A-28 115.9% 104.1% 101.4% 0.97 1.00 0.97 1.00 1.01 1.16 0.76
Project-A-29 119.9% 108.7% 101.4% 0.93 1.00 0.93 1.00 1.01 1.20 0.75
Project-B-1 2.4% 0.0% 0.0% 1.00 1.00 1.00 1.00 1.00 1.00 1.00
Project-B-2 6.9% 2.5% 0.0% 1.00 1.00 1.16 1.00 1.00 1.03 0.99
Project-B-3
Project-M-15
Project-M-16 105.8% 84.4% 99.9% 1.18 1.00 1.08 0.92 1.00 1.13 0.76
Project-M-17 111.8% 92.5% 95.9% 1.04 1.00 1.04 1.00 0.96 1.13 0.75

Fig. 3. Predicted output and actual output of training data.

Fig. 4. Predicted output and actual output of estimate to completion for case L.

In light of the above, the development of a fast and effective system that considers the uncertainty and myriad problems involved in cost control over the course of a project while predicting construction project EAC is an important issue to be resolved. This research aims to resolve questions encountered in project cost management by collecting relevant papers in the literature and historical cases in order to identify the set of factors that significantly influence project cost. A project cost flow trend was then set up using historical case data. The relationship between monthly costs and project EAC was mapped based on past knowledge and experience. An Evolutionary Support Vector Machine Inference Model (ESIM) was established using historical case experience to predict and control EAC variation tendency of the project during construction. Finally, indices of project cost and project cost diagrams were employed to identify the reason for the difference between actual and ESIM-predicted values so that potential problems can be found and effective control and management actions be implemented at proper points in time.

2. Evolutionary Support Vector Machine Inference Model (ESIM)

2.1. Support Vector Machine (SVM)

Support Vector Machine (SVM) is a computer training technique popularized in recent years. It is based on a statistical learning theory described by Vapnik [17]. Traditional training techniques usually focus on minimizing empirical risk; i.e., minimizing the classification error of training data. However, SVM aims to minimize the structural risk in finding a probable upper bound of the classification error of training data [18]. This new computer training technique effectively minimizes the upper bound of theoretical error.

Data classification and regression, two critical components of computer science, are being used in increasingly broad and general applications. Traditional classification methods include neural networks (NNs), decision trees and nearest neighbour method, among others. SVM is a new method that has already proved its value through good results in many applications. SVM has relatively firmer theoretical foundations than NNs. Support Vector Classification (SVC) is founded on the principle of minimizing training theoretical structure risk. An important advantage of SVC is its ability to handle linearly inseparable problems. SVC utilizes existing data to do training and then selects several support vectors by analyzing the training data to represent the whole data. Some extreme values were eliminated in advance. Finally, selected support vectors were packed into a model and SVC was used to carry out classification on testing data. The concept of Support Vector Regression (SVR) is similar to that of SVC. It maps regression problems from low dimensional to high dimensional vector spaces to identify the support vector in which the appropriate linear regression equation could be obtained.

2.2. Fast Messy Genetic Algorithms (fmGA)

The Simple Genetic Algorithm (sGA), an efficient and accurate algorithm, was first developed by Holland in 1975. Goldberg et al. subsequently developed the Messy Genetic Algorithm (mGA) in 1989 in order to improve sGA shortcomings. Several experiments [17,19] have since shown the mGA much better at solving permutation problems than sGA. In 1993, Goldberg established the Fast Messy Genetic Algorithm (fmGA) to reduce the high memory consumption of operation processes [20].

The mGA resolved the problem that sGA does not consider logical limitations amongst gene bunches during the optimization process. There are four main differences in solving mechanisms between fmGA and sGA [21,22]. The first is that chromosomes of variable length could be adopted in fmGA. Secondly, simple cut and splice are used to replace the sGA operator mechanism. Thirdly, the optimization process includes a primordial and juxtapositional stage. Lastly, competitive templates are adopted to retain the most outstanding gene building blocks in each generation [23].

2.3. ESIM framework

The Evolutionary Support Vector Machine Inference Model (ESIM) fuses SVM and fmGA [20,24,25]. In this model, SVM is used to sum up the complicated relationship between input parameters and output parameters, while fmGA searches for the best parameters (C and γ) needed by SVM in order to improve SVM prediction accuracy. The framework of ESIM is shown in Fig. 1.

Steps are explained as follows:

Default C, γ: The value of C and γ may be set up differently to reflect case and problem characteristics. C and γ can be selected as 1 and 1/M respectively, where M stands for parameter number.

Table 4
Validation of training cases.

Project name A B C D E F G H I J K Total

Number of periods 29 24 20 25 26 20 22 27 18 31 27 269


Average error of EACP 8.7% 5.6% 4.1% 5.6% 6.2% 7.6% 3.6% 14.0% 5.0% 4.2% 3.7%
Qualified (<10%) Yes Yes Yes Yes Yes Yes Yes No Yes Yes Yes 91%

Training data set: Before executing the prediction model, patterns of influence must first be found and entered into the system as training data, to serve as prediction input parameters.

SVM training model: In this step, the user collects relevant historical cases for research. Case influence patterns serve as input parameters and the case decision serves as output parameters. Such input and output values became the training data set, and are input into the model as initial training data. SVM regards selected C and γ values as default patterns for the first training process.

Average accuracy: This step regards the reciprocal of the objective function as the fitness function. A larger value correlates to a superior model framework.

Termination criteria: The procedure operates continuously until certain conditions are satisfied, e.g., confirmation of appropriate fitness or absence of conspicuous fitness after making calculations in several generations to demonstrate that convergence has already been reached.

Search for fmGA parameters: In this step, fmGA searches for relatively appropriate C and γ to serve as parameters for the next generation.

Optimized parameters: According to the above-mentioned optimization calculations, the best gene set will be retained. The optimum inference model is obtained after gene set decoding as the C and γ value for SVM type.

3. Constructing EAC prediction model using ESIM (EAC–ESIM)

During the construction phase of the project, planning objectives and achieved percentage are affected by actual work conditions, design changes and external environmental conditions. Adjustments that result to reflect such can generate differences between planned and actual completion costs.

The EAC prediction is generally based on the construction budget developed based on initial project conditions (i.e., the construction scope specified in the contract and environmental factors). EAC factors of influence deduced from reference material are represented as input and prospective project cost is the output. The cost database of project cases was established accordingly. Database records were used to plan values and actual values for each month and calculate the difference between the two. The mapping relationship between input and output was found via case learning and ESIM training. Finally, the prospective cost of a new project was predicted by inputting the monthly cost of a project into the developed system in accordance with training and testing results. The process is described in Fig. 2. EACP is defined as Eq. (1) in this research.

$$\mathrm{EACP} = \mathrm{ACP} + \mathrm{ETCP} \qquad (1)$$

where

EACP (estimate at completion percentage): predicting EAC in advance/total cost of original budget
ACP (actual cost percentage): actual cost at some specific moment/total cost of original budget

Fig. 5. ESIM-predicted and actual EAC values of case L.

Table 5
(a) Testing results of case L. (b) Testing results of case M.

Training set name and no. | Predicted output | Actual output | EACP (predicted) | EACP (actual) | e (error of EACP)
(a)
Project L-1 | 0.76985 | 0.72710 | 89.76% | 84.18% | 6.63%
Project L-2 | 0.75566 | 0.72230 | 88.53% | 84.18% | 5.17%
Project L-3 | 0.69682 | 0.70360 | 83.28% | 84.18% | 1.06%
Project L-4 | 0.66103 | 0.63210 | 87.95% | 84.18% | 4.48%
Project L-5 | 0.62353 | 0.59440 | 87.98% | 84.18% | 4.52%
Project L-6 | 0.57243 | 0.54420 | 87.86% | 84.18% | 4.38%
Project L-7 | 0.52164 | 0.50640 | 86.17% | 84.18% | 2.37%
Project L-8 | 0.47307 | 0.44560 | 87.75% | 84.18% | 4.25%
Project L-9 | 0.43308 | 0.40210 | 88.22% | 84.18% | 4.80%
Project L-10 | 0.40248 | 0.37220 | 88.12% | 84.18% | 4.68%
Project L-11 | 0.42472 | 0.33140 | 96.34% | 84.18% | 14.45%
Project L-12 | 0.39262 | 0.29480 | 96.93% | 84.18% | 15.16%
Project L-13 | 0.37184 | 0.25570 | 99.32% | 84.18% | 18.00%
Project L-14 | 0.33271 | 0.22340 | 98.42% | 84.18% | 16.92%
Project L-15 | 0.30792 | 0.17890 | 101.00% | 84.18% | 19.98%
Project L-16 | 0.13018 | 0.15060 | 81.52% | 84.18% | 3.16%
Project L-17 | 0.11847 | 0.12780 | 82.96% | 84.18% | 1.44%
Project L-18 | 0.12773 | 0.10770 | 86.79% | 84.18% | 3.11%
Project L-19 | 0.09538 | 0.08770 | 85.18% | 84.18% | 1.19%
Project L-20 | 0.05293 | 0.08330 | 80.21% | 84.18% | 4.71%
Average error | | | | | 7.02%
(b)
Project M-1 | 0.74012 | 0.72410 | 94.58% | 84.18% | 2.26%
Project M-2 | 0.65222 | 0.65710 | 91.85% | 84.18% | 0.69%
Project M-3 | 0.61894 | 0.62030 | 92.32% | 84.18% | 0.18%
Project M-4 | 0.57882 | 0.59120 | 90.88% | 84.18% | 1.74%
Project M-5 | 0.56265 | 0.61940 | 85.09% | 84.18% | 8.00%
Project M-6 | 0.51792 | 0.51410 | 92.99% | 84.18% | 0.55%
Project M-7 | 0.49309 | 0.48480 | 93.58% | 84.18% | 1.18%
Project M-8 | 0.45539 | 0.44820 | 93.42% | 84.18% | 1.01%
Project M-9 | 0.41551 | 0.41580 | 92.45% | 84.18% | 0.04%
Project M-10 | 0.39051 | 0.39570 | 91.81% | 84.18% | 0.73%
Project M-11 | 0.29915 | 0.31270 | 90.71% | 84.18% | 1.92%
Project M-12 | 0.26694 | 0.24910 | 94.82% | 84.18% | 2.53%
Project M-13 | 0.23630 | 0.22980 | 93.34% | 84.18% | 0.92%
Project M-14 | 0.18320 | 0.20250 | 89.97% | 84.18% | 2.72%
Project M-15 | 0.17857 | 0.19580 | 90.24% | 84.18% | 2.43%
Project M-16 | 0.15536 | 0.14510 | 93.83% | 84.18% | 1.46%
Project M-17 | 0.08462 | 0.08330 | 82.66% | 84.18% | 0.18%
Average error | | | | | 1.68%

Table 7
General EAC prediction methods of EVM.

Item | Equation | Note
EAC1 | AC + BAC − EV |
EAC2 | BAC / CPI |
EAC3 | BAC / SPI |
EAC4 | AC + (BAC − EV) / CPI |
EAC5 | AC + (BAC − EV) / SPI |
EAC6 | AC + (BAC − EV) / SCI | SCI = CPI * SPI
EAC7 | AC + (BAC − EV) / (W1 * CPI + W2 * SPI) | W1 = 0.8, W2 = 0.2
EAC8 | Execution budget amount | Approved budget

Table 6
Qualification of testing cases.

Project name | Number of periods | ESIM RMS | Average error of EACP | Qualified (<10%)
L | 20 | 0.0594 | 7.02% | Yes
M | 17 | 0.0173 | 1.68% | Yes
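The case L row of Table 6 can be recomputed from the period-level values in Table 5(a). The sketch below is illustrative rather than the authors' code: it assumes RMS means the root of the mean squared difference between predicted and actual (normalized) output, and it takes the error of each period as |predicted EACP − actual EACP| / actual EACP × 100%, the criterion the paper gives as its Eq. (5). The function names are hypothetical; the data are copied from Table 5(a).

```python
import math

# Case L, copied from Table 5(a): normalized ESIM output vs. actual output.
PREDICTED_OUTPUT = [
    0.76985, 0.75566, 0.69682, 0.66103, 0.62353, 0.57243, 0.52164,
    0.47307, 0.43308, 0.40248, 0.42472, 0.39262, 0.37184, 0.33271,
    0.30792, 0.13018, 0.11847, 0.12773, 0.09538, 0.05293,
]
ACTUAL_OUTPUT = [
    0.72710, 0.72230, 0.70360, 0.63210, 0.59440, 0.54420, 0.50640,
    0.44560, 0.40210, 0.37220, 0.33140, 0.29480, 0.25570, 0.22340,
    0.17890, 0.15060, 0.12780, 0.10770, 0.08770, 0.08330,
]
# Case L predicted EACP per period (percent) and the actual EACP (percent).
PREDICTED_EACP = [
    89.76, 88.53, 83.28, 87.95, 87.98, 87.86, 86.17, 87.75, 88.22, 88.12,
    96.34, 96.93, 99.32, 98.42, 101.00, 81.52, 82.96, 86.79, 85.18, 80.21,
]
ACTUAL_EACP = 84.18


def rms(predicted, actual):
    """Root of the mean squared difference (assumed definition of RMS)."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def period_errors(predicted_eacp, actual_eacp):
    """Relative error of each period's predicted EACP, in percent."""
    return [abs(p - actual_eacp) / actual_eacp * 100.0 for p in predicted_eacp]


def average_error(predicted_eacp, actual_eacp):
    """Mean of the per-period errors, in percent."""
    errors = period_errors(predicted_eacp, actual_eacp)
    return sum(errors) / len(errors)


def is_qualified(avg_error_pct, ceiling_pct=10.0):
    """A case qualifies when its average error is below the 10% ceiling."""
    return avg_error_pct < ceiling_pct


case_l_rms = rms(PREDICTED_OUTPUT, ACTUAL_OUTPUT)
case_l_avg_error = average_error(PREDICTED_EACP, ACTUAL_EACP)
```

Under these assumptions the recomputed values land close to the 0.0594 RMS and 7.02% average error that Table 6 reports for case L; small differences in individual periods come from the rounding of the published EACP values.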

Table 8
Error of EVM and ESIM prediction results.

Project name EAC1 EAC2 EAC3 EAC4 EAC5 EAC6 EAC7 EAC8 ESIM prediction Note

A 10.0% 17.1% 6.9% 16.8% 9.7% 15.9% 15.7% 5.2% 8.7%


B 7.0% 9.1% 10.6% 9.1% 7.3% 9.4% 8.4% 7.6% 5.6%
C 27.0% 3.2% 15.9% 3.2% 7.3% 3.2% 3.3% 15.9% 4.1%
D 9.1% 6.5% 23.9% 7.1% 9.1% 7.1% 5.9% 24.8% 5.6%
E 9.7% 35.0% 5.6% 13.7% 9.7% 13.7% 13.0% 4.3% 6.2%
F 5.0% 14.0% 12.9% 12.3% 11.6% 14.4% 11.4% 1.7% 7.6%
G 3.5% 14.1% 17.5% 13.7% 14.8% 40.0% 13.3% 6.1% 3.6%
H 13.6% 8.9% 35.4% 7.5% 13.7% 7.5% 7.3% 32.4% 14.0%
I 8.7% 23.1% 19.4% 23.1% 7.6% 21.7% 20.7% 14.2% 5.0%
J 3.2% 11.4% 7.2% 9.2% 5.4% 9.9% 8.5% 2.1% 4.2%
K 3.3% 6.7% 8.6% 7.4% 3.9% 7.0% 6.1% 6.8% 3.7%
Average 9.1% 13.6% 14.9% 11.2% 9.1% 13.6% 10.3% 11.0% 6.2%
Qualified percentage 81.8% 45.5% 36.4% 54.5% 72.7% 54.5% 54.5% 63.6% 90.9%
L 7.6% 21.0% 23.5% 19.4% 12.1% 19.3% 18.7% 18.0% 7.0% Testing case
M 4.6% 16.2% 12.2% 16.0% 13.1% 21.7% 15.5% 9.3% 1.7% Testing case
Average 6.1% 18.6% 17.9% 17.7% 12.6% 20.5% 17.1% 13.7% 4.4%
Qualified percentage 100% 0% 0% 0% 0% 0% 0% 50% 100%
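The formula-based estimates EAC1 to EAC7 that Table 8 compares against ESIM are simple functions of AC, EV, PV and BAC. A minimal sketch follows, with CPI = EV/AC and SPI = EV/PV as defined in Table 1 and the weights W1 = 0.8, W2 = 0.2 from Table 7. The function name and argument names are hypothetical, and EAC8 is omitted because it is an approved budget figure rather than a formula.

```python
def evm_estimates(ac, ev, pv, bac, w1=0.8, w2=0.2):
    """Return EAC1..EAC7 from Table 7 for one reporting period.

    ac  : actual cost to date (AC)
    ev  : earned value to date (EV)
    pv  : planned value to date (PV)
    bac : budget at completion (BAC)
    """
    cpi = ev / ac            # cost performance index
    spi = ev / pv            # schedule performance index
    sci = cpi * spi          # combined index used by EAC6
    remaining = bac - ev     # budgeted work not yet earned
    return {
        "EAC1": ac + remaining,
        "EAC2": bac / cpi,
        "EAC3": bac / spi,
        "EAC4": ac + remaining / cpi,
        "EAC5": ac + remaining / spi,
        "EAC6": ac + remaining / sci,
        "EAC7": ac + remaining / (w1 * cpi + w2 * spi),
    }


# Illustrative numbers only (not taken from the paper's cases).
example = evm_estimates(ac=50.0, ev=40.0, pv=32.0, bac=100.0)
```

With these made-up inputs the project is over cost (CPI = 0.8) but ahead of schedule (SPI = 1.25), so the seven formulas spread widely, which illustrates why the choice of formula matters for a given project.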

ETCP (estimate to completion percentage): estimate to completion at some specific moment/total cost of original budget; also the output value of ESIM

The processes of developing the model include “Identify Significant Factors of Influence”, “Historical Data Collection”, “Model Training” and “Model Testing”. The details of each process are illustrated accordingly.

3.1. Identify Significant Factors of Influence

This research identified factors that significantly affect the EAC of construction projects using several relevant publications (listed in Table 1). Input parameters of the model were obtained after time-dependent cost factors and performance management concepts were also considered.

Fig. 6. Processes of applying EAC–ESIM for cost management.



Fig. 7. Cost exception management based on prediction results.

Due to the variety of construction project categories and resultant differences in data inputs, this research focused only on construction projects done in reinforced concrete in order to control for potential variance in results attributable to construction type. Ten significant influence factors for EAC were calculated as input values based on monthly construction cost data. Prospective cost was designated as the output value.

3.2. Historical Data Collection

The 13 historical cases included in this research were all reinforced concrete projects executed between 2000 and 2007 by one engineering company located in Taipei City. Projects were located chiefly in northern Taiwan, with selection criteria considering data distribution and completeness. Building height ranged from 9 to 17 stories (inclusive of stories belowground). The value of contracts ranged from NT$80 million to NT$1.1 billion. Overall floor area of cases studied ranged from 3094 m² to 31,797 m². Construction durations ranged from 457 to 749 days. Relevant data from historical cases used are arranged in Table 2.

Monthly cost records for every case were collected and the ten identified factors of significant influence on EAC were calculated as input values in accordance with definition equations (see Table 3). The 13 historical cases were divided into training and testing groups. The 11 cases in the training group comprised 269 periods and the 2 cases in the testing group comprised 37 periods.

Output parameters were normalized by linear scaling. The normalization method was revised to keep estimated cost in the range of 0 to 1, so that maximum and minimum of output parameters were enlarged by 10%. The detail processes are shown in Eqs. (2)–(4).

$$X_n = \frac{X_a - X_L}{X_U - X_L} \qquad (2)$$

$$X_U = X_{max} + X_{range} \times 10\% \qquad (3)$$

$$X_L = X_{min} - X_{range} \times 10\% \qquad (4)$$

where

X_n: output parameter after normalization, the range is between 0 and 1
X_a: output parameter before normalization
X_U: upper bound of the output parameter
X_L: lower bound of the output parameter
X_max: maximum of the output parameter
X_min: minimum of the output parameter
X_range: difference between maximum and minimum

Table 9
Cost influencing indices.

Index name | Definition | Criterion
Cost performance index | Earned value/actual cost | ≧ 1.05
Schedule performance index | Earned value/planned value | ≧ 1
Subcontractor billed index | Billed amount/actual cost | ≧ 1, ≦ 1.1
Owner billed index | Billed amount/earned value | ≧ 0.9
Change order index | Revised contract amount/budget at completion | ≦ 1.1, ≧ 0.9
Construction cost index | Construction material price index of that month/construction material price index of initial stage | Progress < 30%, ≦ 1.02; Progress < 60%, ≦ 1.03
Climate effect index | (Revised project duration − no. of rainy days)/revised project duration | ≧ 1 − 0.2 × progress percentage

Table 10
Monthly cost information.

Item | Definition
Estimate at completion percentage (EACP) | Predicted EAC of ESIM/original budget amount
Actual cost percentage (ACP) | Actual paid of cost/original budget amount
Earned value percentage (EVP) | Budget cost at completion/original budget amount
Planned value percentage (PVP) | Budget cost of planning/original budget amount
Contract billed percentage (CBP) | Proprietor pricing amount/original contracted amount
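The linear scaling in Eqs. (2)–(4), together with Eq. (1), can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and `denormalize` is simply the algebraic inverse of Eq. (2), which is assumed to be what the paper means by de-normalizing the predicted output.

```python
def scaling_bounds(values, margin=0.10):
    """Eqs. (3) and (4): widen [min, max] by `margin` times the range."""
    x_min, x_max = min(values), max(values)
    x_range = x_max - x_min
    x_lower = x_min - margin * x_range   # X_L
    x_upper = x_max + margin * x_range   # X_U
    return x_lower, x_upper


def normalize(x_a, x_lower, x_upper):
    """Eq. (2): map a raw output X_a to X_n."""
    return (x_a - x_lower) / (x_upper - x_lower)


def denormalize(x_n, x_lower, x_upper):
    """Inverse of Eq. (2): recover the raw output from X_n (assumed step)."""
    return x_n * (x_upper - x_lower) + x_lower


def eacp(acp, etcp):
    """Eq. (1): estimate at completion % = actual cost % + estimate to completion %."""
    return acp + etcp


# Illustrative values only (not from the paper's database).
raw_outputs = [0.0, 0.5, 1.0]
x_lower, x_upper = scaling_bounds(raw_outputs)
scaled = [normalize(x, x_lower, x_upper) for x in raw_outputs]
```

Because the bounds are widened by 10% of the range on each side, the training minimum and maximum map strictly inside (0, 1) rather than onto 0 and 1, which leaves headroom for new cases whose output falls slightly outside the training range.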

Table 11
Comparison of cost variance.

Item | Definition | Illustration
Budget | Revised budget % (RBP) − estimate at completion % (EACP) | >0: cash balance of budget; <0: not enough in budget, exception management
Schedule | Earned value % (EVP) − planned value % (PVP) | >0: exceed in progress; <0: delay in progress, overtime planning
Cost | Earned value % (EVP) − actual cost % (ACP) | >0: cash balance of contract, not pricing by subcontractor; <0: overspend of contract, missed list, not transacting budget increase
Contract billed | Contract billed % (CBP) − actual cost % (ACP) | >0: cash balance of amount, large quantities of payment, subcontractor not valuating; <0: contract increased or constructed without valuation, contract decreased and not revised, over amount
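The four variances in Table 11 are plain differences between monthly percentages, with a negative value signalling the condition that needs attention. A minimal sketch follows; the function names and the sample figures are hypothetical, and the inputs are expressed as fractions of the original budget.

```python
def cost_variances(rbp, eacp, evp, pvp, acp, cbp):
    """Return the four Table 11 variances for one period."""
    return {
        "budget": rbp - eacp,            # revised budget % minus EACP
        "schedule": evp - pvp,           # earned value % minus planned value %
        "cost": evp - acp,               # earned value % minus actual cost %
        "contract_billed": cbp - acp,    # contract billed % minus actual cost %
    }


def flag_exceptions(variances):
    """List the variances that are negative, in Table 11 order."""
    return [name for name, value in variances.items() if value < 0]


# Illustrative period (made-up numbers).
sample = cost_variances(rbp=1.00, eacp=1.05, evp=0.40, pvp=0.45, acp=0.38, cbp=0.42)
flags = flag_exceptions(sample)
```

In this made-up period the predicted EACP exceeds the revised budget and earned value lags planned value, so the budget and schedule rows would be flagged while the cost and contract billed rows would not.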

3.3. Model Training

Training of the ESIM training module began after parameter selection. A total of 100 generations were searched over a period of 1.0167 min. The fault-tolerant parameter C was 20 and the kernel function parameter γ equalled 0.1. The Root Mean Square (RMS), which describes how well the model fits the data, equalled 0.0559 for the model obtained from the optimum chromosome.

3.4. Model Testing

The ESIM performance assessment module was employed after completion of training. ESIM decoded the optimum chromosome as the EAC prediction model so that Model Testing could reveal prediction error and learning accuracy. Model Testing included testing of training error and case verification.

Model rules were learned by training on historical case data. The model was established utilizing ESIM to search for training cases that possessed consistency between inference output and actual output. After training, model accuracy was tested by comparing differences between output results and actual values. Estimating criteria used in this research are shown in Eq. (5). The model was qualified when the error fulfilled selected requirements.

$$ e_i = \frac{\left| \mathrm{EAC}^{\mathrm{ESIM}}_{P_i} - \mathrm{AEAC}_{P} \right|}{\mathrm{AEAC}_{P}} \times 100\% \qquad (5) $$

where

$e_i$: error percentage of the ith period predicted by ESIM
$\mathrm{EAC}^{\mathrm{ESIM}}_{P_i}$: estimate at completion percentage of the ith period predicted by ESIM
$\mathrm{AEAC}_{P}$: actual estimate at completion percentage of the case

3.4.1. Testing of training error

The 269 periods collected from 11 cases were input into the database. An estimated value of prospective cost percentage for each case could be obtained via the ESIM performance assessment module. The estimated value is termed ‘Predicted output’. The actual value of the prospective cost percentage for each case is termed ‘Actual output’. Fig. 3 shows the relationship between Predicted output and Actual output produced by ESIM training. EACP (see Eq. (2)) could be obtained after de-normalizing the Predicted output, and the training error percentage could be calculated by making a comparison with the corresponding percentage of actual cost at completion.

During construction, managers are most concerned with the development trend and its influence on project decision making. To determine model accuracy after training and meet the practical needs of managers, this research selected a 10% average error as the qualification ceiling. The qualified rate was 91% for the 11 training cases in this research, as shown in Table 4.

3.4.2. Verification of testing cases

A total of 269 periods collected from 11 cases were input into the database. An estimated value of prospective cost percentage for each case could be obtained using the ESIM performance assessment module. As above, the estimated value is termed ‘Predicted output’, while the actual value of prospective cost percentage for each case is termed ‘Actual output’. Fig. 4 shows the relationship between Predicted output and Actual output produced by ESIM training. After de-normalizing the Predicted output, EACP (see Eq. (2)) could be obtained and the percentage of the training error could be calculated through a comparison with the percentage of actual cost at completion.

The main purpose of verification is to examine whether the model trained by ESIM can be employed to infer or predict cases beyond training cases. Data for two testing cases (Case L and M) were input

Fig. 8. Time series of the cost influencing indices for case L.



Fig. 9. Cost management diagram of case L.

into the ESIM performance assessment module, with results shown in Table 5. Comparing the predicted values of EAC with actual values, average errors were 7.02% and 1.68%, respectively. Both results were considered ‘qualified’ under the definition set above (Table 6).

Figs. 4 and 5 were drawn in order to perform further analysis for case L, which returned a comparatively large error. From the 11th period to the 15th period, the error values, between 14.45% and 19.98%, were noticeably large. A possible reason is a rapid revision of the project budget during this period, which would make the data unstable and cause larger error values.

Revisions and updates of the project budget are not rare in practice for construction projects, with or without change orders. In this study, the historical cases used for Model Training implicitly included the uncertainty of change orders for replanning. This is the reason the study uses AI to address the uncertainty. To identify the impacts of change orders on EAC, a real case with a significant change order, i.e. Project L, was selected for Model Testing. The results showed that the prediction error increased for a period of time after the change order (but the error was still tolerable). Overall, the prediction converged to the actual result in the end.

4. Comparison of EAC–ESIM prediction with 8 EVM methods

To demonstrate that the EAC prediction model proposed in this research is feasible and reliable, EAC values were also calculated using eight other EVM methods [16]. The values are compared with the predictions generated by the ESIM (Table 7) to assess comparative accuracy. This research selected a 10% error as the qualification criterion, and calculated the average error and the qualified percentage for each prediction method. Results of the comparison are shown in Table 8 and discussed below:

1. No single EVM method predicts EAC at a consistently high level of accuracy. Prediction results varied with differences in project details. For the cases studied in this research, EAC1 and EAC5 attained better average error and qualified rates.
2. ESIM predictions showed a larger difference of accuracy during the initial 30% of the project work schedule. This may be attributable to data instability in the initial stages and the influence of design changes. However, the prediction result attained by ESIM is more precise than those of the EVMs.
3. The predictive error of ESIM is comparatively steady. Qualified rates of training and testing cases were 91% and 100%, respectively. Both rates were larger than those of the EVMs. This demonstrates the feasibility of EAC predictions using ESIM.

5. Applications of EAC–ESIM to construction management

5.1. Applying EAC–ESIM for cost management

After the EAC prediction model using ESIM was established, the feasibility of the model needed to be proven using actual projects. A prediction could be made once data had been collected and formatted to model requirements.

As the prediction model must address real project aspects, the procedure used to apply the EAC prediction model was designed as shown in Fig. 6, in accordance with the various situations of each construction stage.
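The qualification check used in the model testing and in the comparison with the EVM methods, i.e. Eq. (5) combined with the 10% ceiling on average error, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the example period values are made up, and treating an average error exactly equal to 10% as qualified is an assumption, since the text does not say whether the ceiling is inclusive.

```python
def error_percentage(predicted_eacp, actual_eacp):
    """Eq. (5): absolute error of one period's predicted EACP,
    expressed as a percentage of the actual value for the case."""
    return abs(predicted_eacp - actual_eacp) / actual_eacp * 100.0


def case_average_error(period_predictions, actual_eacp):
    """Average of the Eq. (5) errors over all periods of one case."""
    errors = [error_percentage(p, actual_eacp) for p in period_predictions]
    return sum(errors) / len(errors)


def is_qualified(average_error, ceiling=10.0):
    """A case is 'qualified' when its average error stays within the ceiling.
    Assumption: the 10% ceiling is inclusive."""
    return average_error <= ceiling


def qualified_rate(flags):
    """Percentage of cases that are qualified."""
    return 100.0 * sum(flags) / len(flags)


# Hypothetical per-period predictions for one case (not data from the paper).
predictions = [95.0, 102.0, 99.0]
average = case_average_error(predictions, actual_eacp=100.0)
print(f"average error: {average:.2f}%, qualified: {is_qualified(average)}")
```

Under this reading, the reported average errors of 7.02% and 1.68% for the two testing cases both pass the check, and a 91% qualified rate over 11 training cases is what 10 qualified cases out of 11 would give.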
5.2. Cost exception management based on prediction results

An effective prediction model establishes a response system able to identify factors that significantly influence the EAC at different project stages. The predicted trend acquired from the ESIM can then be compared with project cost influencing indices and project cost diagrams. Finally, revised propositions for deviations may be addressed and tracked continuously in order to effectively control costs.

The EAC prediction provides a key reference to construction managers by analyzing the cost information over the course of the project. Furthermore, the managers may assess, term by term, the factors of influence over cost and consider the various potential reasons that might result in cost overruns when an EAC prediction exceeds the approved budget, allowing managers to identify potential problems and control costs.

Table 12
Cost influencing indices analysis of case L.

Index name                    Value   Illustration
Cost performance index        1.17    ≧1.05, well cost management, but the ratio is slightly high.
Schedule performance index    0.96    ≦1, slightly delay in progress, and the trend is increasing.
Subcontractor billed index    1.05    ≧1, ≦1.1, normal.
Owner billed index            0.90    ≧0.9, normal.
Change order index            1.03    ≦1.1, normal.
Construction cost index       1.09    ≧1.03, slightly high, but the main decoration engineering items are contracted.
Climate effect index          0.83    ≦0.86, normal.

Table 13
Cost diagrams analysis of case L.

Item     RBP    EACP   ACP    EVP    PVP    CBP
Value    104%   97%    57%    66%    69%    60%

Analysis      Difference   Illustration
RBP > EACP    +7%          Possible to decrease budget.
EVP < PVP     −3%          Delay in progress, and the trend is increasing.
EVP > ACP     +9%          Cash balance of contract, or items not priced by subcontractor.
CBP > ACP     +3%          Cash balance of amount, but the ratio is reducing, which indicates that the proprietor adds items without pricing.

Reasons
1. Addendum of structural engineering revised in the eleventh month, with an evaluated delay of three months.
2. Current contract cash balance: 5%, but the cash balance decrease was not transacted while the budget increased in the eleventh month.
3. Items not yet priced by the subcontractor: approximately 2%.
4. Fits with estimation results.
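The differences listed in Table 13 follow directly from the four comparisons used in the cost diagram analysis (budget, schedule, cost and contract billed), the first of which is the VACP of Eq. (6). The sketch below recomputes them from the Table 13 values for case L. The function and dictionary names are hypothetical, the readings are paraphrased from the comparison table (Table 11), and the "on target" label for a zero difference is an assumption because the paper only describes positive and negative cases.

```python
# Readings for a positive / negative difference, paraphrased from Table 11.
READINGS = {
    "budget": ("cash balance of budget",
               "not enough in budget, exception management"),
    "schedule": ("exceed in progress",
                 "delay in progress, overtime planning"),
    "cost": ("cash balance of contract",
             "overspend of contract"),
    "contract_billed": ("cash balance of amount",
                        "contract increased or constructed without valuation"),
}


def cost_diagram_differences(rbp, eacp, acp, evp, pvp, cbp):
    """Return the four percentage-point differences used in the cost diagram."""
    return {
        "budget": rbp - eacp,           # VACP, Eq. (6)
        "schedule": evp - pvp,
        "cost": evp - acp,
        "contract_billed": cbp - acp,
    }


def interpret(differences):
    """Attach a sign-based reading to each difference."""
    result = {}
    for item, diff in differences.items():
        positive, negative = READINGS[item]
        if diff > 0:
            label = positive
        elif diff < 0:
            label = negative
        else:
            label = "on target"  # assumption: not covered by the paper
        result[item] = (diff, label)
    return result


# Values of Table 13 (case L), in percent.
case_l = cost_diagram_differences(rbp=104, eacp=97, acp=57, evp=66, pvp=69, cbp=60)
for item, (diff, label) in interpret(case_l).items():
    print(f"{item}: {diff:+g}% -> {label}")
```

The computed differences (+7, −3, +9 and +3 percentage points) agree with the Difference column of Table 13; the positive VACP is what the table reads as room to decrease the budget.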

Training and convergence testing demonstrated model fitness for use. This research uses the Variance at Completion percentage (VACP, defined as Eq. (6)) as the criterion for exception management. The application procedure of EAC management is illustrated in Fig. 7.

$$ \mathrm{VACP} = \mathrm{RBP} - \mathrm{EACP} \qquad (6) $$

where

RBP: Revised Budget Percentage

5.2.1. Project cost influencing indices analysis

Indices can monitor and identify project deviations effectively and rapidly. However, factors that influence project cost are complicated, and the definition and criteria of indices are indeterminate in research done in relevant fields (see a list of literature reviewed in Table 9).

5.2.2. Project cost diagrams analysis

Indices that influence project cost can be provided to managers to help them control prospective costs and investigate potential cost management problems pre-emptively. Nevertheless, overall construction cost trends cannot be determined by analyzing project cost indices alone, as cost problems may display abnormalities in some situations. Thus, project cost diagram analysis is used as a supplementary method in this research.

This research used original contract documents to define the scope of comparative data. Project cost diagrams were drawn on a monthly basis based on EAC prediction results and data items shown in Table 10. Project schedule tendencies and costs may be obtained by comparing and analyzing variations in cost information (Table 11). Therefore, appropriate prewarnings may be heeded and operations enforced in accordance with project characteristics. Furthermore, comprehensive inspection and strategy direction may be provided.

5.2.3. Application of an actual example case

Example Case L: Time series for cost influencing indices and cost management are shown in Figs. 8 and 9, respectively. Moreover, this research selects the time point at which project progress reaches 71.16% (Project-L-12) to analyze indices of project cost (see Table 12) and project cost diagrams (Table 13 and Fig. 10). Prediction results were verified by comparing analysis results with the project completion report, leading us to conclude that results are feasible when applied to the effective management and control of project costs.

6. Conclusion

This research proposed an EAC prediction method using ESIM that employs fmGA and SVM. Results obtained in this paper are summarized as follows:

1. Research established an EAC prediction method utilizing ESIM that identifies significant factors of influence on project cost and performs predictions by collecting and arranging experience-based rules from historical cases. ESIM results represent a significant improvement over results obtainable using traditional EAC prediction methods.
2. The EAC prediction model established in this research is considerably accurate. Following training and testing, the only inputs that need to be entered into the model are the key factors influencing costs during the current month. A significant disadvantage for traditional construction projects, i.e., that cost tendencies cannot be predicted in real time, has been effectively remedied.
3. ESIM prediction results were compared with eight common EVM prediction methods. Values obtained by the EVM methods were relatively unstable due to the wide variance in data among projects. Conversely, the model developed in this research generated comparatively steady prediction values. This verified the feasibility of using ESIM to predict construction project EACs.
4. Prediction results were analyzed further using project cost influencing indices and project cost diagrams. This helped identify the causes underlying EAC differences and trends. Results help managers to control project costs in real time.

Fig. 10. Cost diagram of case L.

References
[1] D. Bolles, S. Fahrenkrog (Eds.), A Guide to the Project Management Body of Knowledge (PMBOK Guide), 3rd Ed, Project Management Institute, Newtown Square, 2004.
[2] G. Clifford, E. Larson, Project Management: the Complete Guide for Every Manager, McGraw-Hill, New York, 2002.
[3] G.A. Barraza, W.E. Back, F. Mata, Probabilistic forecasting of project performance using stochastic S curves, Journal of Construction Engineering and Management 130 (1) (2004) 25–32.
[4] D.E. Lee, Probability of project completion using stochastic project scheduling simulation, Journal of Construction Engineering and Management 131 (3) (2005) 310–318.
[5] D.E. Lee, D. Arditi, Automated statistical analysis in stochastic project scheduling simulation, Journal of Construction Engineering and Management 132 (3) (2006) 268–277.
[6] B.C. Kim, K.F. Reinschmidt, Probabilistic forecasting of project duration using Bayesian inference and the beta distribution, Journal of Construction Engineering and Management 135 (3) (2009) 178–186.
[7] E.H. Kim, W.G. Wells Jr., M.R. Duffey, A model for effective implementation of earned value management methodology, International Journal of Project Management 21 (5) (2003) 375–382.
[8] D.F. Cioffi, Designing project management: a scientific notation and an improved formalism for earned value calculations, International Journal of Project Management 24 (2) (2006) 136–144.
[9] S. Vandevoorde, M. Vanhoucke, A comparison of different project duration forecasting methods using earned value metrics, International Journal of Project Management 24 (4) (2006) 289–302.
[10] G. Vitner, S. Rozenes, S. Spraggett, Using data envelope analysis to compare project efficiency in a multi-project environment, International Journal of Project Management 24 (4) (2006) 323–329.
[11] M. Vanhoucke, S. Vandevoorde, A simulation and evaluation of earned value metrics to forecast the project duration, Journal of the Operational Research Society 58 (10) (2007) 1361–1374.
[12] W. Lipke, O. Zwikael, K. Henderson, F. Anbari, Prediction of project outcome: the application of statistical methods to earned value management and earned schedule performance indices, International Journal of Project Management 27 (4) (2009) 400–407.
[13] M. Plaza, O. Turetken, A model-based DSS for integrating the impact of learning in project control, Decision Support Systems 47 (4) (2009) 488–499.
[14] S.S. Leu, Y.C. Lin, Project performance evaluation based on statistical process control techniques, Journal of Construction Engineering and Management 134 (10) (2008) 813–819.
[15] H.L. Stephenson, Identifying risks and opportunities using EAC, Proc. 48th AACE International Annual Meeting '04, AACE International Transactions, 2004, pp. CSC.06.1–CSC.06.9.
[16] S. Alexander, Earned Value Management Systems (EVMS): Basic Concepts, Project Management Institute, Washington DC, 2002.
[17] V.N. Vapnik, The nature of statistical learning theory, Springer, New York, 1995.
[18] C.W. Hsu, C.J. Lin, A simple decomposition method for support vector machine, Machine Learning 46 (1–3) (2002) 219–314.
[19] C.F. Lin, Fuzzy support vector machines, Ph.D. Thesis, Department of Electrical Engineering, National Taiwan University, 2004.
[20] D.E. Goldberg, K. Deb, H. Kargupta, G. Harik, Rapid, accurate optimization of difficult problems using fast messy genetic algorithms, Proc. 5th International Conference on Genetic Algorithms '93, Morgan Kaufmann Pub. Inc, San Mateo, 1993, pp. 56–64.
[21] R. Day, J. Zydallis, G. Lamont, Competitive template analysis of the fast messy genetic algorithm when applied to the protein structure prediction problem, Proc. 2nd ICCN '02, Computational Publications, Cambridge, 2002, pp. 36–39.
[22] C.W. Feng, H.T. Wu, Integrating fmGA and CYCLONE to optimize the schedule of dispatching RMC trucks, Automation in Construction 15 (2) (2006) 186–199.
[23] H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector regression machines, Proc. 10th Annual Conference on NIPS '96, Advances in Neural Information Processing Systems, vol. 9, MIT Press, Cambridge, 1997, pp. 155–161.
[24] C.C. Chang, C.J. Lin, Training nu-support vector classifiers: theory and algorithms, Neural Computation 13 (9) (2001) 2119–2147.
[25] D. Knjazew, G.A. Ome, A Competent Genetic Algorithm for Solving Permutation and Scheduling Problems, Kluwer Academic Publishers, Boston, 2003.
