ABSTRACT: Estimation of the cost of a construction project is an important task in the management of con-
struction projects. The quality of construction management depends on accurate estimation of the construction
cost. Highway construction costs are very noisy and the noise is the result of many unpredictable factors. In
this paper, a regularization neural network is formulated and a neural network architecture is presented for
estimation of the cost of construction projects. The model is applied to estimate the cost of reinforced-concrete
pavements as an example. The new computational model is based on a solid mathematical foundation making
the cost estimation consistently more reliable and predictable. Further, the result of estimation from the regu-
larization neural network depends only on the training examples. It does not depend on the architecture of the
neural network, the learning parameters, and the number of iterations required for training the system. Moreover,
the problem of noise in the data is taken into account in a rational manner.

Downloaded from ascelibrary.org by University of Leeds on 04/30/13. Copyright ASCE. For personal use only; all rights reserved.
where X_i = (x_1^i, x_2^i, ..., x_p^i) = ith example with p input attributes (x_n^i is the nth attribute of the ith example); and d_i = corresponding example output. The approximation mapping function is denoted by F(x).

What is the best fit? This is an important question. Because of the noise in the data examples, a perfect fit, that is, when F(x_i) = d_i, usually is not the best fit. In this case, the approximation function is often very curvy, with numerous steep peaks and valleys that lead to poor generalization. This is the overfitting problem mentioned earlier. Two other fitting situations can also be recognized: underfitting, with oversmooth surfaces resulting in poor generalization, and proper fitting. Only the last type of fitting can lead to accurate generalization.

    E_c(F) = (1/2) ||PF||^2     (5)

where the symbol ||g|| denotes the norm of function g(x), defined as

    ||g||^2 = \int_{R^p} [g(x)]^2 dx_1 dx_2 ... dx_p     (6)

and P = a linear-differential operator defined as (Poggio and Girosi 1990; Al-Gwaiz 1992)

    ||PF||^2 = \sum_{k=0}^{K} b_k ||D^k F(x)||^2     (7)

    ||D^k F||^2 = \sum_{|\alpha|=k} \int_{R^p} [\partial^\alpha F(x)]^2 dx_1 dx_2 ... dx_p     (8)
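A small one-dimensional numerical check can make the smoothness functional of (5)-(8) concrete. The sketch below is illustrative only and is not from the paper: the finite-difference derivatives, the rectangle-rule integrals, and the choices K = 2 and b_k = 1 are all assumptions made for simplicity.

```python
# Illustrative approximation (assumed setup, not the paper's code) of the
# smoothness functional E_c(F) = (1/2) * sum_k b_k * integral (F^(k)(x))^2 dx
# in one dimension, using finite differences for the derivatives.
import numpy as np

def smoothness_penalty(f_values, dx, K=2, b=(1.0, 1.0, 1.0)):
    """Approximate (1/2) * sum_{k=0}^{K} b_k * integral (F^(k))^2 dx."""
    total = 0.0
    deriv = np.asarray(f_values, dtype=float)
    for k in range(K + 1):
        total += b[k] * np.sum(deriv ** 2) * dx  # rectangle-rule integral
        deriv = np.diff(deriv) / dx              # next finite-difference derivative
    return 0.5 * total

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
smooth = x                                        # gentle linear trend
wiggly = x + 0.1 * np.sin(40 * np.pi * x)         # same trend plus steep oscillations
```

As expected, the oscillatory function accumulates a far larger penalty than the smooth one, which is why minimizing an error function containing this term discourages the curvy, overfitted approximations described above.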
The multiindex \alpha = (\alpha_1, \alpha_2, ..., \alpha_p) = a sequence of nonnegative integers whose order is defined as |\alpha| = \sum_{l=1}^{p} \alpha_l. In (8), the partial differential term inside the bracket is defined as

    \partial^\alpha F(x) = \partial^{|\alpha|} F(x) / (\partial x_1^{\alpha_1} \partial x_2^{\alpha_2} ... \partial x_p^{\alpha_p})     (9)

Therefore, the regularization term is

    E_c(F) = (1/2) \sum_{k=0}^{K} b_k \sum_{|\alpha|=k} \int_{R^p} [\partial^\alpha F(x)]^2 dx_1 dx_2 ... dx_p     (10)

This function is simply the summation of integrations of the squared partial derivatives of the approximation function. As such, the regularization term is small when the function is smooth, because the derivatives tend to be small, and vice versa.

For b_k = \beta^{2k}/(k! 2^k) and K approaching infinity, where \beta is a positive real number, it can be proved that by minimizing the error function, (3), with respect to the approximation function, the solution of the problem can be written in the following form (Poggio and Girosi 1990):

    F(x) = \sum_{i=1}^{N} w_i exp(-(1/(2\beta^2)) ||x - x_i||^2) = \sum_{i=1}^{N} w_i exp(-\sigma ||x - x_i||^2)     (11)

where \sigma = 1/(2\beta^2).

Determination of the parameters of the regularization network consists of two steps. In the first step, the value of the parameter \sigma in (11) is found by a cross-validation procedure that is described later. The smoothness of the approximation function is primarily controlled by this parameter. The smaller the value of \sigma, the smoother the approximation function. We will call \sigma the smoothing parameter. In the second step, w_i is found using the method described in the next section.

DETERMINATION OF THE REGULARIZATION NETWORK WEIGHTS

The smoothing parameter \sigma and the weights w_i depend on the example data. Define

    d = [d_1, d_2, ..., d_N]^T     (12)

    G = | G(x_1; x_1)  G(x_1; x_2)  ...  G(x_1; x_N) |
        | G(x_2; x_1)  G(x_2; x_2)  ...  G(x_2; x_N) |
        | ...                                        |
        | G(x_N; x_1)  G(x_N; x_2)  ...  G(x_N; x_N) |     (13)

    w = [w_1, w_2, ..., w_N]^T     (14)

where G(x_i; x_j) = exp[-\sigma ||x_i - x_j||^2]. It can be shown that the solution of the regularization problem, i.e., the weights w_i, satisfies the following equation (Haykin 1994):

    (G + I)w = d     (15)

    w = \sum_{i=1}^{J} [(u^{(i)T} d) / c_i] v^{(i)}     (17)

where u^{(i)}, i = 1, ..., N = ith column of U; and v^{(i)}, i = 1, ..., N = ith column of V. In (17), the summation is performed over J terms and not over all the N terms in order to avoid numerical ill-conditioning due to division by very small numbers and the truncation error. The c_i terms in the denominator of (17) cannot take very small values. All of the singular values used in (17) are greater than \epsilon = N \epsilon_m c_1, where \epsilon_m is the machine precision and c_1 is the largest singular value. By selecting a predetermined value for \epsilon_m, the small values of c_i are excluded from the summation in (17).

FIG. 2. Architecture of Regularization Network for Construction Cost Estimation (input layer with attributes x_1, ..., x_p; hidden layer of Gaussian functions; output layer)
20/ JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT / JANUARY/FEBRUARY 1998
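The weight computation of (12)-(17) can be sketched in a few lines. The code below is a hedged illustration, not the authors' implementation: the synthetic data, the value of the smoothing parameter, and the machine-precision setting eps_m are assumptions, and NumPy's SVD routine stands in for whatever solver the authors used.

```python
# Sketch (assumed details) of the regularization network of Eqs. (11)-(17):
# Gaussian kernel matrix G, the linear system (G + I)w = d, and the truncated
# singular-value solution with threshold eps = N * eps_m * c_1.
import numpy as np

def fit_regularization_network(X, d, sigma, eps_m=1e-12):
    """Weights of Eq. (17) via truncated SVD of (G + I); see Eqs. (12)-(15)."""
    N = len(X)
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    G = np.exp(-sigma * sq_dist)                 # G_ij of Eq. (13)
    U, c, Vt = np.linalg.svd(G + np.eye(N))      # singular values c_1 >= ... >= c_N
    eps = N * eps_m * c[0]                       # threshold: N * machine eps * c_1
    keep = c > eps                               # summation over J <= N terms
    return (Vt[keep].T / c[keep]) @ (U[:, keep].T @ d)   # Eq. (17)

def predict(X_train, w, sigma, X_new):
    """F(x) of Eq. (11) evaluated at the rows of X_new."""
    sq_dist = np.sum((X_new[:, None, :] - X_train[None, :, :]) ** 2, axis=2)
    return np.exp(-sigma * sq_dist) @ w

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(40, 2))          # 40 synthetic examples, p = 2
d = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=40)  # noisy outputs

w = fit_regularization_network(X, d, sigma=5.0)
F = predict(X, w, sigma=5.0, X_new=X)            # smoothed fit, not an exact fit
```

With the identity matrix added as in (15), the singular values are bounded away from zero for modest N, so the truncation threshold mainly guards against ill-conditioning in large or poorly scaled systems.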
In the cross-validation method, the available set of examples is divided into a training set and a validation set. For each value of \sigma, the w_i weights are found using (16) and (17), and an average training error is calculated in the following form:

    E_t = \sum_{n=1}^{N_t} [d_n^t - F(x_n^t)]^2 / N_t     (18a)

Next, using the validation set (x_n^v, d_n^v), n = 1, 2, ..., N_v, an average validation error is calculated in the following form:

    E_v = \sum_{n=1}^{N_v} [d_n^v - F(x_n^v)]^2 / N_v     (18b)

Typical trend relationships between the average training and validation errors and the smoothing factor \sigma are shown in Fig. 3. The average training error always decreases with an increase in the magnitude of \sigma for a numerically stable algorithm. In contrast, the average validation error curve does not have a continuously decreasing trend. Rather, one can identify a minimum on this curve. As mentioned earlier, broadly speaking, a large \sigma indicates overfitting and a small \sigma indicates underfitting. The \sigma corresponding to the global minimum point on the validation curve represents the properly fitted estimation curve. The average validation error gives an estimate of the estimation/prediction error.

where x̄^n, n = 1, 2, ..., N = normalized input data, and the means and standard deviations used in the normalization are those of the original set of variables.

The Gaussian activation function is maximum at its center (data point) and approaches zero at large distances from the center. In other words, statistically speaking, use of the Gaussian activation function amounts to a large output near the center (data point) and zero output at large distances from the center, where there is no data point. But the lack of a data point does not necessarily mean the output is zero at large distances from the available sample data points.

One may argue that it is not possible to make an accurate estimate at large distances from the example data points. This is the well-known extrapolation problem. While regularization theory solves the interpolation problem accurately, it is not concerned with the extrapolation problem. But a practical estimation system should not fail abruptly at the boundaries of the available data domain. Consequently, to improve the estimation accuracy at large distances from the available data points, first a linear trend (hyperplane) is found through the example data points by performing a linear regression analysis. Next, the output data are normalized with respect to this hyperplane (the outputs are measured from this plane instead of a zero-base hyperplane). Finally, a regularization network is applied using the normalized data output. This process will bring the estimates at large distances from the available data points close to the linear trend hyperplane.

Mathematically, the function \sum_{n=1}^{N} (d^n - \sum_{i=1}^{p} a_i x̄_i^n - a_0)^2 is minimized with respect to the linear parameters a_i (i = 0, 1, ..., p) in order to find the linear trend hyperplane. This hyperplane is represented by

    ȳ = \sum_{i=1}^{p} a_i x̄_i + a_0     (22)
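The two-step procedure, detrending by the hyperplane of (22) and then selecting \sigma by cross-validation with (18a) and (18b), can be sketched as follows. This is an assumed illustration (synthetic data, an arbitrary grid of \sigma values, and a simple split in place of the paper's exact partitioning), not the paper's code.

```python
# Hedged sketch: hyperplane detrending (Eq. 22) followed by cross-validation
# selection of the smoothing parameter sigma (Eqs. 18a and 18b).
import numpy as np

def kernel(A, B, sigma):
    """Gaussian kernel matrix: K_ij = exp(-sigma * ||a_i - b_j||^2)."""
    return np.exp(-sigma * np.sum((A[:, None] - B[None, :]) ** 2, axis=2))

def linear_trend(X, d):
    """Least-squares fit of the hyperplane y = sum_i a_i x_i + a_0, Eq. (22)."""
    A = np.hstack([X, np.ones((len(X), 1))])     # extra column for the bias a_0
    coef, *_ = np.linalg.lstsq(A, d, rcond=None)
    return coef                                  # [a_1, ..., a_p, a_0]

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(60, 2))          # 60 synthetic examples, p = 2
d = 3.0 * X[:, 0] - X[:, 1] + 0.5 + 0.05 * rng.normal(size=60)

# Measure the outputs from the linear-trend hyperplane, per Eq. (22).
coef = linear_trend(X, d)
resid = d - (X @ coef[:-1] + coef[-1])

# Simple split into a training set and a validation set.
X_tr, d_tr, X_va, d_va = X[:40], resid[:40], X[40:], resid[40:]

train_err, val_err = {}, {}
for sigma in (0.01, 0.1, 1.0, 10.0, 100.0):      # assumed grid of sigma values
    w = np.linalg.solve(kernel(X_tr, X_tr, sigma) + np.eye(40), d_tr)  # Eq. (15)
    train_err[sigma] = np.mean((d_tr - kernel(X_tr, X_tr, sigma) @ w) ** 2)  # (18a)
    val_err[sigma] = np.mean((d_va - kernel(X_va, X_tr, sigma) @ w) ** 2)    # (18b)

best_sigma = min(val_err, key=val_err.get)       # minimum of the validation curve
```

Inspecting `val_err` over the grid reproduces the qualitative behavior of Fig. 3: the smoothing parameter chosen is the one at the minimum of the validation-error curve.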
eralization. The corresponding average training and validation errors for the unit cost of the concrete pavement are $6.45/m3 and $7.22/m3, respectively. For comparison, the average unit cost of the concrete pavement for the 242 example data is $39.2/m3. Fig. 6 shows the learned curve along with the training and validation data sets.

FIG. 6. Properly Generalized Learned Curve and Training/Validation Data Set (unit cost, $/m3, versus quantity, cubic meters)

FIG. 8. Average Training and Validation Errors for Different Values of \sigma Using Quantity and Dimension Information (average error versus \sigma; minimum validation error point marked)