Business Research Assignment
Full-time MBA 2009 Utrecht
Date of submission: 17th November 2009
Word count: 900 words (excluding Appendix)
FTMBA09, UB Number: 09028224
Table of Contents
Executive Summary
Introduction
Objective
Data and Methodology
Data Analysis
Linear Multiple Regression Analysis
Conclusion
Recommendations
Appendix
Table of illustrations
Tables
Table 1: Descriptive statistics of each variable from District A and B
Table 2: Correlations table
Table 3: Multiple Regression Analysis of Price, H_Size, Age, District, H_Dist and Age_Dist
Charts
Chart 1: Box plot of price in District A and B
Chart 2: Histogram of residual value from regression Model 3
Chart 3: Residual value plotted against predicted value from regression Model 3
Chart 4: Scatter plot, Y axis = Price, X axis = H_Size, Z axis = Age, separated by district
Housing Price Prediction Model
November 17, 2009
Executive summary
This report develops a reliable housing price prediction model that forecasts selling prices in Districts A and B using multiple linear regression. The model explains 88.6% of the total variation in price within the relevant range of house size and house age.
Introduction
Objective
Data and Methodology
Several real estate agents and property assessors were interviewed to identify the major explanatory variables that might affect property prices. The following independent variables were considered:
Quantitative variables: H_Size (house size in square feet), L_Size (lot size in acres), Age (house age in years), Attract (an attractiveness rating of the property ranging from 0 to 100, the higher the better), P_Tax (property tax of the prior year in dollars), N_Rooms (number of bedrooms in the house)
Qualitative variable: District (the district in the city: 0 for District A, 1 for District B)
Data Analysis
Chart 1: Box plot of price in District A and B
The median property price in District B is higher than in District A, and neither district shows outliers. This suggests that the price data are reliable.
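Reading "no outliers" from a box plot usually relies on the 1.5 × IQR whisker rule. The sketch below shows that check under the assumption that the common Tukey convention applies; the report does not state which rule SPSS used, the function name is illustrative, and the prices are made up rather than taken from the survey data.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical prices in USD, for illustration only (not the report's data).
sample_prices = [180_000, 205_000, 220_000, 231_000, 250_000, 262_000, 900_000]
outliers = iqr_outliers(sample_prices)
print(outliers)
```

An empty result for both districts would correspond to the box plots described above.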
Table 1: Descriptive statistics of each variable from District A (District = 0) and District B (District = 1)
Real Estate Association
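Table 2: Correlations table
Linear Multiple Regression Analysis
Model 1:
$$\widehat{\mathrm{Price}} = 50215.092 + 94.322\,\mathrm{H\_Size} - 1241.796\,\mathrm{Age} + 6.994\,\mathrm{H\_Dist} + 1087.526\,\mathrm{Age\_Dist}$$
p-values: constant 0.000, H_Size 0.000, Age 0.000, H_Dist 0.023, Age_Dist 0.001
R² = 0.886, Adjusted R² = 0.885
Std. Error of the Estimate = 46669.901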
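Model 2:
$$\widehat{\mathrm{Price}} = 47888.970 + 95.212\,\mathrm{H\_Size} - 1211.297\,\mathrm{Age} + 4.856\,\mathrm{H\_Dist} + 1029.883\,\mathrm{Age\_Dist} + 8885.618\,\mathrm{District}$$
p-values: constant 0.000, H_Size 0.000, Age 0.000, H_Dist 0.384, Age_Dist 0.004, District 0.646
R² = 0.886, Adjusted R² = 0.885
Std. Error of the Estimate = 46699.619
Model 3:
$$\widehat{\mathrm{Price}} = 42025.525 + 98.102\,\mathrm{H\_Size} - 1212.041\,\mathrm{Age} + 1025.601\,\mathrm{Age\_Dist} + 22962.084\,\mathrm{District}$$
p-values: constant 0.000, H_Size 0.000, Age 0.000, Age_Dist 0.004, District 0.031
R² = 0.886, Adjusted R² = 0.885
Std. Error of the Estimate = 46690.584
[...] because Age_Dist is the interaction term of Age and District.
Lastly, residual pattern analysis of Model 3 shows there [...] and independence of errors assumptions (see Charts 2 and 3).
Chart 3: Residual value plotted against predicted value from regression Model 3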
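Conclusion
Prediction equation for District A (District = 0):
$$\widehat{\mathrm{Price}} = 42025.525 + 98.102\,\mathrm{H\_Size} - 1212.041\,\mathrm{Age}$$
Prediction equation for District B (District = 1):
$$\widehat{\mathrm{Price}} = 64987.609 + 98.102\,\mathrm{H\_Size} - 186.44\,\mathrm{Age}$$
Chart 4: Scatter plot, Y axis = Price, X axis = H_Size, Z axis = Age, separated by district
Recommendations
UB Number: 09028224
Full-time MBA 2009,
TiasNimbas Business School, Utrecht
The Netherlands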
Appendix
1 Define objective
To develop a regression model as a tool for predicting the selling price of residential properties in both districts in the city
2 Specify model
Using a linear multiple regression model (1 dependent variable and many independent variables)
3 Collect data
The data consist of 625 properties sold in the past 3 months in both District A and District B.
3.1 Dependent variable
Price (house selling price in USD)
3.2 Initial independent variables
Quantitative
H_Size (house size in square feet)
L_Size (lot size in acres)
Age (house age in years)
Attract (an attractiveness rating of the property ranging from 0 to 100, the higher the better)
P_Tax (property tax of the prior year in dollars)
N_Rooms (number of bedrooms in the house)
Qualitative
District (the district in the city: 0 for District A, 1 for District B)
4 Descriptive Data Analysis
Figure 1: Box plot of Price
There are no outliers in Price. The median price in District B is higher than in District A.
Figure 2: Box plots of all quantitative variables (panels: H_Size, L_Size, Age, P_Tax, N_Rooms, Attract)
There are no outliers in any of the independent variables. H_Size, L_Size, Age and P_Tax show the same box-plot pattern.
Figure 3: Descriptive statistics of each variable from District A (District = 0) and District B (District = 1)
The average housing price in District B (USD 453,980.94) is higher than in District A (USD 226,174.77). Consequently, the average property tax is also higher in District B (USD 5,300.90) than in District A (USD 1,655.65). Additionally, the average house size and lot size in District B are 4,055.05 square feet and 1.4568 acres respectively, larger than those in District A, which are 2,032.47 square feet and 0.6608 acres respectively. The average age of a house in District B is 47.28 years, older than in District A, where it is 12.57 years. Average attractiveness and number of bedrooms do not differ significantly between the two districts.
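These averages can be tied to the final regression model with a quick consistency check. Because Model 3 contains both a constant and the District dummy, an ordinary least squares fit passes through each district's mean point, so plugging the district means into the Model 3 coefficients should return roughly the average prices. The sketch below does that; the function name is illustrative, and small gaps are expected because the published means and coefficients are rounded.

```python
def model3_price(h_size, age, district):
    """Model 3 prediction in USD; district is 0 for A and 1 for B."""
    return (42025.525 + 98.102 * h_size - 1212.041 * age
            + 1025.601 * age * district + 22962.084 * district)


# District means of H_Size and Age as reported in the descriptive statistics.
mean_a = model3_price(h_size=2032.47, age=12.57, district=0)
mean_b = model3_price(h_size=4055.05, age=47.28, district=1)
print(round(mean_a, 2), round(mean_b, 2))
```

Both results land within a few dollars of the reported averages (USD 226,174.77 and USD 453,980.94), which also confirms that the Age coefficient enters the model with a negative sign.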
Figure 4: Correlation between each pair of variables in both districts
Attract and N_Rooms have no significant relationship with Price. However, H_Size, L_Size, Age and P_Tax are significantly correlated with each other. This means that there is multicollinearity among the independent variables.
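One simple way to screen for the multicollinearity described above is to flag predictor pairs whose pairwise correlation is large in absolute value. The following sketch assumes a cut-off of 0.8, which is a common rule of thumb and not a value taken from this report; the helper names and the data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(columns, threshold=0.8):
    """List predictor pairs whose |r| exceeds the (assumed) threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical values, for illustration only (not the report's data).
toy = {
    "H_Size": [1500, 2000, 2500, 3000, 4000],
    "P_Tax": [1200, 1700, 2100, 2600, 3500],
    "Attract": [60, 45, 70, 50, 65],
}
flagged = collinear_pairs(toy)
print(flagged)
```

Pairwise correlations only catch two-variable dependence; a variance inflation factor would be the usual follow-up when several predictors move together.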
5 Estimate unknown parameters and evaluate model
We have one qualitative independent variable, District. Therefore, we add District as a dummy variable to the linear multiple regression model and create 2 interaction terms (H_Dist: H_Size*District, Age_Dist: Age*District). Attract and N_Rooms are eliminated because they have no relationship with Price (from the correlation analysis). We decide not to add P_Tax because the price must be known before the tax is paid, which makes it unsuitable as a predictor in a price prediction model.
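The two interaction terms are plain products of a quantitative variable and the District dummy, so they are zero for every District A property and equal to the underlying variable for every District B property. A minimal sketch of that construction, using a hypothetical helper and made-up records:

```python
def add_interactions(row):
    """Return a copy of one property record with H_Dist and Age_Dist added."""
    out = dict(row)
    out["H_Dist"] = row["H_Size"] * row["District"]
    out["Age_Dist"] = row["Age"] * row["District"]
    return out


# Hypothetical records, for illustration only.
house_a = add_interactions({"H_Size": 2000, "Age": 10, "District": 0})
house_b = add_interactions({"H_Size": 4000, "Age": 45, "District": 1})
print(house_a)
print(house_b)
```

This is why the interaction coefficients can be read as the difference in slope between District B and District A.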
Figure 5: Statistical results from SPSS
R² = 0.886, Adjusted R² = 0.885 and Standard Error of the Estimate = 46733.169
F test (overall test):
$H_0$: all slope coefficients are zero; $H_1$: at least one slope coefficient is non-zero.
F = 799.862, p-value = 0.000, which is less than 0.05 (95% confidence level)
We reject the null hypothesis. We are 95% confident that there is a significant linear relationship between the independent variables and the dependent variable.
T test (individual test)
We fail to reject $H_0: \beta_{\mathrm{L\_Size}} = 0$. At the 5% significance level, we find no significant linear relationship between L_Size and Price.
Now, we eliminate L_Size from the initial model.
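The overall F statistic can be cross-checked from R² alone, given the sample size and the number of predictors. The sketch below assumes n = 625 (the number of properties in the data set) and six predictors in the initial model (H_Size, L_Size, Age, District, H_Dist, Age_Dist). Because R² is only reported to three decimals, the result approximates the SPSS value instead of reproducing it exactly.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic of a regression, computed from R-squared."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)


# Initial model: 625 observations, 6 predictors, R-squared rounded to 0.886.
f_initial = f_statistic(0.886, n_obs=625, n_predictors=6)
print(round(f_initial, 1))
```

The same function with five and four predictors gives values close to the F statistics reported for the two reduced models.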
Figure 6: Statistical results from SPSS
R² = 0.886, Adjusted R² = 0.885 and Standard Error of the Estimate = 46699.169
F test (overall test):
$H_0$: all slope coefficients are zero; $H_1$: at least one slope coefficient is non-zero.
F = 961.192, p-value = 0.000, which is less than 0.05 (95% confidence level)
We reject the null hypothesis. We are 95% confident that there is a significant linear relationship between the independent variables and the dependent variable.
T test (individual test)
We fail to reject $H_0: \beta_{\mathrm{H\_Dist}} = 0$. At the 5% significance level, we find no significant linear relationship between H_Dist and Price.
We also fail to reject $H_0: \beta_{\mathrm{District}} = 0$, but we keep District in the model because it is the dummy variable. It is not practical to specify and fit a model that includes interaction terms but drops the main effect of the dummy variable.
Now, we eliminate H_Dist from the model.
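The elimination step above follows a simple rule: drop the least significant term whose p-value exceeds 0.05, unless that term is a main effect that a retained interaction depends on. A minimal sketch of the rule, using the p-values reported for Model 2 in the main text; the function and the `protected` argument are illustrative names, not part of the SPSS output.

```python
ALPHA = 0.05


def next_to_drop(p_values, protected=()):
    """Pick the least significant removable term, or None if all pass."""
    candidates = {
        term: p for term, p in p_values.items()
        if p > ALPHA and term not in protected
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# p-values reported for Model 2 (H_Size, Age, H_Dist, Age_Dist, District).
model2_p = {"H_Size": 0.000, "Age": 0.000, "H_Dist": 0.384,
            "Age_Dist": 0.004, "District": 0.646}
# District is protected because the Age_Dist interaction depends on it.
dropped = next_to_drop(model2_p, protected={"District"})
print(dropped)
```

Without the protection, the rule would remove District first, which is the situation the text warns against.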
Figure 7: Statistical results from SPSS
R² = 0.886, Adjusted R² = 0.885 and Standard Error of the Estimate = 46690.584
F test (overall test):
$H_0$: all slope coefficients are zero; $H_1$: at least one slope coefficient is non-zero.
F = 1201.765, p-value = 0.000, which is less than 0.05 (95% confidence level)
We reject the null hypothesis. We are 95% confident that there is a significant linear relationship between the independent variables and the dependent variable.
T test (individual test)
We reject all five null hypotheses ($H_0: \beta_j = 0$ for the constant, H_Size, Age, Age_Dist and District). All p-values are less than 0.05. We are 95% confident that there is a significant linear relationship between each independent variable and Price.
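6 Prediction model
$$\widehat{\mathrm{Price}} = 42025.525 + 98.102\,\mathrm{H\_Size} - 1212.041\,\mathrm{Age} + 1025.601\,\mathrm{Age\_Dist} + 22962.084\,\mathrm{District}$$
Prediction equation for District A (District = 0):
$$\widehat{\mathrm{Price}} = 42025.525 + 98.102\,\mathrm{H\_Size} - 1212.041\,\mathrm{Age}$$
Prediction equation for District B (District = 1):
$$\widehat{\mathrm{Price}} = 64987.609 + 98.102\,\mathrm{H\_Size} - 186.44\,\mathrm{Age}$$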
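The two district equations follow from the general model by substituting District = 0 or 1, since Age_Dist equals Age × District. The sketch below performs that substitution and then uses the result to price a house. The names are illustrative, and the example house is hypothetical.

```python
COEF = {"const": 42025.525, "H_Size": 98.102, "Age": -1212.041,
        "Age_Dist": 1025.601, "District": 22962.084}


def district_equation(district):
    """Collapse the model to (intercept, H_Size slope, Age slope) for one district."""
    intercept = COEF["const"] + COEF["District"] * district
    age_slope = COEF["Age"] + COEF["Age_Dist"] * district
    return round(intercept, 3), COEF["H_Size"], round(age_slope, 3)


def predict_price(h_size, age, district):
    """Predicted selling price in USD for district 0 (A) or 1 (B)."""
    intercept, size_slope, age_slope = district_equation(district)
    return intercept + size_slope * h_size + age_slope * age


print(district_equation(0))
print(district_equation(1))
# Hypothetical house: 2,500 square feet, 20 years old, District B.
print(round(predict_price(h_size=2500, age=20, district=1), 2))
```

Predictions are only meaningful within the range of house sizes and ages observed in the data.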