Regression Analysis For Cost Modelling

WELCOME
Regression Cost Model

Introduction
Regression analysis: a statistical method by which the value of a variable is estimated from the known values of one or more other variables, and the errors involved in this estimating process are measured
Normally used in situations where the relationship between variables is not unique
Main types
o Simple Linear Regression Analysis
o Multiple Linear Regression Analysis
Assumptions
o The standard deviation of the error associated with the dependent variable (cost) remains constant throughout the domain
o This error is normally distributed
o The effect of any variable is always expressed in terms of a fixed cost increase or decrease, irrespective of project size or type
Simple Linear Regression Analysis
Two-variable linear regression describes the relationship between two variables by computing a straight line through the data obtained
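Dependent variable (y) - the value to be estimated
Independent variable (x) - the factor from which the estimates are made
Constant (a) - the value of y when the independent variable is zero
Coefficient of x (b) - the slope of the line
Expression for a straight line: y = a + bx
[Figure: straight line y = a + bx, with the dependent variable on the vertical axis and the independent variable on the horizontal axis; a is the intercept and b is the tangent of the angle the line makes with the horizontal axis]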
Prediction within the range of values in the dataset is known as interpolation
Prediction outside this range of the data is known as extrapolation
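The straight-line expression y = a + bx and the distinction between interpolation and extrapolation can be sketched in a few lines of Python. This is only an illustration: the function names and the data are made up for the example, and the data lie exactly on y = 2 + 3x so the fitted constants are easy to check by hand.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope of the line
    a = mean_y - b * mean_x    # value of y when x is zero
    return a, b


def predict(a, b, x, x_min, x_max):
    """Return the estimate and whether it is an interpolation or an extrapolation."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return a + b * x, kind


# Made-up illustrative data lying exactly on y = 2 + 3x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [5.0, 8.0, 11.0, 14.0]

a, b = fit_line(xs, ys)
inside = predict(a, b, 2.5, min(xs), max(xs))    # x within the data range
outside = predict(a, b, 10.0, min(xs), max(xs))  # x beyond the data range
print(a, b, inside, outside)
```

An estimate flagged as an extrapolation relies on the line continuing to hold outside the observed data, which the data themselves cannot confirm.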
Steps of SLR Model
Specification
Begins with theoretical reasoning on the relationship between variables
Form equations to represent the relationships between variables
Since the population parameters are unknown, a sample is considered and the model is built with estimated values
Estimation
The least squares estimation procedure is used most of the time
Includes a series of statistical tests to make sure that the estimated model is a good representation of the postulated relationship
Validation
Evaluate the quality of the model
Evaluated on the basis of the following statistics
o Coefficient of determination
o Standard error
o F-ratio test - the ratio of the regression mean square to the residual mean square
o t-ratio test - the ratio of the coefficient to its standard error
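The four validation statistics listed above can be computed directly from the sums of squares of a fitted line. The sketch below assumes a simple linear regression with one independent variable (so the regression has 1 degree of freedom and the residual has n - 2); the function name and the data are made up for illustration.

```python
import math


def slr_validation(xs, ys):
    """Return validation statistics for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x

    sst = sum((y - mean_y) ** 2 for y in ys)                   # total sum of squares
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # residual sum of squares
    ssr = sst - sse                                            # regression sum of squares
    residual_ms = sse / (n - 2)                                # residual mean square

    return {
        "r_squared": ssr / sst,                       # coefficient of determination
        "std_error": math.sqrt(residual_ms),          # standard error of estimate
        "f_ratio": (ssr / 1) / residual_ms,           # regression MS / residual MS
        "t_ratio": b / math.sqrt(residual_ms / sxx),  # coefficient / its standard error
    }


# Made-up illustrative data with a little scatter around a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
stats = slr_validation(xs, ys)
print(stats)
```

With a single independent variable the F-ratio equals the square of the t-ratio of the slope, which gives a quick consistency check on the calculation.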
Forecasting
Forecasts should be satisfactory to the users
Accuracy depends on the amount of error that is acceptable in the model
Multiple Linear Regression Analysis
This aims to create a relationship between the dependent variable and several independent variables.
Dependent / Response variable - y
Independent / Explanatory variables - x1, x2, x3, ..., xn
y = a + b1x1 + b2x2 + b3x3 + ... + bnxn + e
where e is the error term
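A minimal sketch of how the constant a and the coefficients b1 ... bn might be estimated by least squares is shown below. It solves the normal equations with a small hand-written Gaussian elimination so that it needs no external library; the helper names `solve` and `fit_mlr` are hypothetical, and the data are generated from y = 1 + 2x1 + 3x2 with no error term purely so the result can be checked.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_mlr(rows, ys):
    """Return [a, b1, ..., bn] minimising the squared errors of y = a + sum(bi * xi)."""
    X = [[1.0] + list(r) for r in rows]  # leading 1 carries the constant a
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * y for x, y in zip(X, ys)) for i in range(k)]
    return solve(XtX, Xty)


# Made-up data generated from y = 1 + 2*x1 + 3*x2 (no error term)
rows = [(1, 1), (2, 1), (3, 2), (4, 3), (5, 5), (6, 4)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coeffs = fit_mlr(rows, ys)
print(coeffs)  # approximately [a, b1, b2] = [1, 2, 3]
```

In practice a statistical package or spreadsheet would do this step; the sketch only shows what "least squares" computes.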
Steps of MLR Model
Specification
Begins with theoretical reasoning on the relationship between variables
Selecting a full set of explanatory (independent) variables
Estimation
Determining the correlation coefficients between all possible pairs
Resolving multicollinearity
Eliminating non-significant variables one at a time until all the remaining variables are significant
o Use of the t-ratio - a large t-ratio is desirable
o Use of the F-ratio - test for the significance of the overall dependence of y on the variables (x1, x2, ..., xn)
Estimation (continued)
Constructing a multiple linear regression model
o Estimates of the coefficients in the regression model are made by the method of least squares, which is used due to its simplicity
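The step of eliminating non-significant variables one at a time can be sketched as a loop that refits the model and drops the variable with the smallest absolute t-ratio. Everything specific here is an assumption made for illustration: the cut-off of |t| >= 2, the helper names, the variable names x1 to x3, and the data, which were constructed so that y depends on x1 and x2 but not on x3.

```python
import math


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols_with_t(X, ys):
    """Least-squares coefficients and their t-ratios (coefficient / standard error)."""
    n, k = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(k)]
    beta = solve(XtX, Xty)
    sse = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(X, ys))
    s2 = sse / (n - k)  # residual mean square
    t = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        inv_col = solve(XtX, unit)  # column i of the inverse of X'X
        t.append(beta[i] / math.sqrt(s2 * inv_col[i]))
    return beta, t


def backward_eliminate(data, ys, t_min=2.0):
    """Drop the variable with the smallest |t| until all remaining |t| >= t_min."""
    names = list(data)
    while names:
        X = [[1.0] + [data[nm][i] for nm in names] for i in range(len(ys))]
        _, t = ols_with_t(X, ys)
        t_vars = [abs(v) for v in t[1:]]  # skip the constant term
        worst = min(range(len(names)), key=lambda i: t_vars[i])
        if t_vars[worst] >= t_min:
            break
        names.pop(worst)
    return names


# Made-up data: y = 10 + 3*x1 + 2*x2 plus a small error; x3 has no effect on y
data = {
    "x1": [2, 4, 2, 4, 2, 4, 2, 4],
    "x2": [5, 5, 7, 7, 5, 5, 7, 7],
    "x3": [8, 8, 8, 8, 12, 12, 12, 12],
}
ys = [25.5, 32.5, 30.5, 35.5, 26.5, 31.5, 29.5, 36.5]

kept = backward_eliminate(data, ys)
print(kept)
```

The appropriate cut-off depends on the sample size and the significance level chosen; 2 is used here only as a round figure.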

Validation
Validation is done before practical use in the construction industry, using another actual project
Forecasting
If validation is a success, the model is put to practical use on construction projects
Application in Construction Industry
Simple Linear Regression Analysis
Case Study: Consider the possible sample values of bricklayer hours and areas of brickwork from 10 fictitious contracts in the following table

Plotted scatter diagram
Scatter is caused by factors other than area which affect the hours required
o Bricklayer-hours: dependent / response variable
o Area of brickwork: independent / regression variable
To avoid individual judgement in constructing the line, the method of least squares is used
In fitting the regression line to a set of data, several parameters are estimated, which need to be tested for significance before being accepted
As an overall guide to the strength of association between the two variables, the correlation coefficient is calculated
Perfect correlation = 1
Calculated correlation coefficient = 0.998
This shows an excellent degree of correlation, which cannot be found by using one variable only
The standard error of estimate, the anticipated difference between the actual values and what the regression line predicts, should be calculated
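As a rough illustration of these two figures, the sketch below computes a correlation coefficient and a standard error of estimate for a small set of area and bricklayer-hour values. The numbers are made up for the example and are not the case-study table, so the result will not be 0.998. The standard error uses the identity that, for a fitted straight line, the residual sum of squares equals (1 - r^2) times the total sum of squares of y.

```python
import math


def correlation_and_std_error(xs, ys):
    """Return (r, standard error of estimate) for a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)            # correlation coefficient
    residual_ss = (1 - r * r) * syy           # residual sum of squares
    std_error = math.sqrt(residual_ss / (n - 2))
    return r, std_error


# Made-up illustrative values (not the case-study data)
areas = [10.0, 20.0, 30.0, 40.0, 50.0]   # area of brickwork
hours = [12.0, 19.0, 31.0, 42.0, 48.0]   # bricklayer-hours

r, std_error = correlation_and_std_error(areas, hours)
print(round(r, 3), round(std_error, 2))
```

The standard error is in the same units as the dependent variable (hours here), which makes it easier to judge than r when deciding whether the estimates are accurate enough.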
Application in Construction Industry
Multiple Linear Regression Analysis

Case Study: Obtaining a model that will estimate productivity rates of concrete operations. For this, a wastewater project in the North-East of Scotland (Project A) was observed. The regression analysis methodology used in this study is backward-elimination stepwise regression.
Firstly, listing out the independent variables of Project A

o Type of pour
o Total volume (m3)
o Number of trucks on job
o Average volume of load (m3)
o Start time
o Average truck cycle time (minutes)
o Number of loads
o Weather
o Concrete mix
Calculating the correlation coefficients between all possible pairs by using the inbuilt functions of Microsoft Excel
Resolving multicollinearity by removing one variable (Total volume) out of the two highly correlated variables (i.e. Total volume and No. of loads)
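The study used Microsoft Excel for this step. An equivalent check can be sketched in Python as below: compute the correlation coefficient for every pair of explanatory variables and list the pairs above a chosen threshold. The threshold of 0.9, the variable names and all of the values are assumptions made for illustration, not figures from Project A.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(data, threshold=0.9):
    """Return (name1, name2, r) for every pair with |r| at or above the threshold."""
    pairs = []
    for name1, name2 in combinations(data, 2):
        r = pearson(data[name1], data[name2])
        if abs(r) >= threshold:
            pairs.append((name1, name2, r))
    return pairs


# Made-up values for three explanatory variables over five pours
data = {
    "total_volume": [30, 60, 90, 120, 150],
    "number_of_loads": [5, 10, 16, 20, 25],
    "trucks_on_job": [3, 2, 4, 2, 3],
}

pairs = highly_correlated_pairs(data)
print(pairs)
```

For each flagged pair, one of the two variables would be dropped before the regression is run, as was done with Total volume in the study.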
Estimating partial regression coefficients and the corresponding t-statistics from the regression on actual productivity for all eight explanatory variables
Insignificant variables have t-statistics with small absolute values and should be eliminated
Carrying out two further runs, eliminating the insignificant variables, concrete mix (t-statistic = 0.97) and start time (t-statistic = 1.72), from the regression model
An important assumption made is that the variability of the data does not change for different levels of the response or explanatory variables.
o This is checked by examining residual plots.
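A residual plot is normally judged by eye. As a crude numeric stand-in, the sketch below fits a line, pairs each fitted value with its residual, and compares the average size of the residuals in the upper half of the fitted values with that in the lower half. The ratio is an assumption made for this example, not a standard test, and the data are made up so that the scatter grows with x.

```python
def fitted_and_residuals(xs, ys):
    """Return a list of (fitted value, residual) for the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return [(a + b * x, y - (a + b * x)) for x, y in zip(xs, ys)]


def spread_ratio(pairs):
    """Mean |residual| for the upper half of fitted values divided by the lower half."""
    ordered = sorted(pairs)  # sort by fitted value
    half = len(ordered) // 2
    lower = [abs(res) for _, res in ordered[:half]]
    upper = [abs(res) for _, res in ordered[half:]]
    return (sum(upper) / len(upper)) / (sum(lower) / len(lower))


# Made-up data in which the scatter increases with x
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.5, 12.8, 13.0, 17.2]

pairs = fitted_and_residuals(xs, ys)
ratio = spread_ratio(pairs)
print(round(ratio, 2))
```

A ratio close to 1 is consistent with constant variability; a ratio well above or below 1 suggests the scatter changes with the level of the response, which is what a fan-shaped residual plot would show.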
Constructing a multiple linear regression model for actual productivity for a single server concrete system.

Pactual = 1.31 Tp + 1.75 Va + 0.56 Tn + 0.59 W - 0.01 Ct - 0.37 Ln - 6.95
Tp = Type of pour
Va = Average volume of concrete
Tn = Number of trucks on job
W = Weather
Ct = Average cycle time
Ln = Number of loads
Validation is done by using actual concrete pours from another wastewater project in Scotland, carried out by a different contractor (Project B). The actual productivities achieved on 32 operations observed on Project B are compared to the productivities predicted by the derived regression model.
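One simple way to summarise such a comparison is the mean error (which shows bias) and the mean absolute percentage error (which shows typical accuracy). The sketch below is only an illustration of that idea: the choice of these two measures is an assumption, and the four actual and predicted values are made up, not the Project B observations.

```python
def validation_summary(actual, predicted):
    """Return (mean error, mean absolute percentage error) of the predictions."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mean_error = sum(errors) / len(errors)  # positive means over-prediction on average
    mape = 100 * sum(abs(e) / a for e, a in zip(errors, actual)) / len(actual)
    return mean_error, mape


# Made-up productivities for four operations
actual = [10.0, 20.0, 25.0, 40.0]
predicted = [11.0, 18.0, 25.0, 44.0]

mean_error, mape = validation_summary(actual, predicted)
print(mean_error, mape)
```

Whether the resulting error is acceptable depends on how the estimates are to be used, so the threshold is left to the user.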
Drawbacks in regression
Multicollinearity
If the explanatory variables in a multiple regression are correlated and the correlation coefficient (positive or negative) is high, it is difficult to separate their individual effects on the dependent variable. This leads to poorly estimated partial regression coefficients.
Omitted variables
If independent variables that have significant relationships with the dependent variable are left out of the model, the results will not be satisfactory.
E.g. location, quality etc. cannot be quantified
Bias in selecting independent variables
Endogeneity
Changes in the dependent variable cause changes in the independent variable.
Development of Regression Model
Development of technologies in computing, accessing, processing and storing data

All major statistical software packages perform least squares regression analysis.
Simple linear and multiple regression using least squares can be done in some spreadsheet applications and on some calculators.
Specialized regression software has been developed for use in fields such as survey analysis and neuroimaging.
The Constructive Cost Model (COCOMO) - an example of an algorithmic software cost estimation model developed using a basic regression formula.
Conclusions
Regression analysis falls under the algorithmic cost model, which uses mathematical formulae linking costs/inputs with metrics to produce an estimated output.
It is used not only for estimating costs but also for forecasting productivity, time and any other parameter.
It is a widely used method, not just in the construction industry.
When there is only one major factor affecting the response, SLR can be used.
When there is more than one major factor affecting the response, MLR can be used.
There are several drawbacks and limitations in this method.
Knowledge of using regression analysis in specialized cost estimation software, in spreadsheets and on a calculator is beneficial for the Quantity Surveyor.
