SYS 302
Spring 2000
Professor Tony Smith
Yale Chang
Carl Yeung
Chris Yip
TABLE OF CONTENTS
I. INTRODUCTION
A. Chosen Economic Variables
B. Assumptions on the Regression Model
II. ANALYSIS
A. Single Regression Models of TCB 500 Against Indicators
B. Preliminary Multiple Regression
C. Multicollinearity
D. Choosing Variables With the Stepwise Regression Model
E. Gauss-Markov Assumptions: Heteroscedasticity and Autocorrelation
F. Predictive Abilities of the Regression Models
III. CONCLUSION
A. Single Regression Discussion
B. Multiple Regression Discussion
IV. SUPPLEMENTS
A. Appendix A
B. Appendix B
I. INTRODUCTION
Every month, anxious investors eagerly await the release of key economic
indicators such as the employment report, CPI, and even housing starts. It is not
uncommon for the Dow Jones Industrial Average and NASDAQ to swing more than a
hundred points when the numbers only slightly miss consensus estimates. Every indicator
is an important measure of some facet of the domestic economy, but do these numbers
really shape the movement of stock prices in the long run? Which indicators exert the
most influence on the equity market? Can a model consisting of these indicators be
constructed to accurately forecast the stock market? And is any single indicator a good
predictor of stock prices? As curious investors ourselves, we developed a statistical
model in an attempt to detect a trend between stock prices and such variables and
evaluated the predictive abilities of the model.
Data was obtained from The Conference Board Economic Indicator Package,
provided by Wharton Research Data Services (WRDS). Monthly time series data was
obtained for stock market prices and a selection of economic indicators over a span of
twenty years, from January 1979 to January 1999. This period was chosen because of the
relative stability of the economy, the nation's minimal exposure to severe external shocks
(e.g., wars), and the comprehensiveness of the data. The stock market index provided by
the Conference Board is the TCB 500 common stock index, which is not commonly
quoted; each data point represents the index's closing price for the given month. This
index was employed in our analysis because it represents the stock market more fully
than the Dow Jones Industrial Average, which includes only thirty stocks. Furthermore, a
comparison of the TCB 500 and the S&P 500 (SPX) revealed that the two indices are almost
identical, as the single regression below shows:
[Figure: scatter plot of SPX by 500 Stock (TCB 500), with linear fit]

Linear Fit
SPX = 0.05124 + 1.00486 × 500 Stock

Summary of Fit
RSquare                       0.997687
RSquare Adj                   0.997678
Root Mean Square Error        12.80636
Mean of Response              370.3486
Observations (or Sum Wgts)    241

A time series graph comparing the two indices is also shown below:
[Figure: time series of the S&P 500 and the TCB 500, 1979-1999]
Personal Income:
14. Personal income less transfer payments (AR, bil. chain 1992 $)
15. Index of consumer confidence (1985=100) COPYRIGHTED (The Conf Bd)
II. ANALYSIS
A. Single Regression Models of TCB 500 Against Indicators
To begin our study, single regression models of the TCB 500 index were run
against each economic indicator to obtain a graphical interpretation of how well each
variable correlates with the stock market. The regression plots for each indicator are
attached at the end of the report as Appendix A. These plots show that the only indicators
which seem to display a smooth, consistent relationship with the TCB 500 index are the
following: index of ten leading indicators, manufacturing and trade sales, CPI, and
personal income. The polynomial fits of these four variables correlate surprisingly well
with the stock index, with R² values of at least 0.95 (see Appendix B). With this
information in mind, we proceeded to perform a preliminary multiple regression.
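
As a rough illustration of the single regressions described above, the following sketch fits one indicator against the index by ordinary least squares and reports R². It is not the statistical package used for this report; the function name `simple_ols` and the data values are hypothetical and serve only to show the calculation.

```python
from statistics import fmean


def simple_ols(x, y):
    """Fit y = a + b*x by least squares; return (intercept, slope, r_squared)."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For a one-variable fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Hypothetical values, not the report's data.
indicator = [100.0, 102.0, 105.0, 109.0, 114.0, 120.0]
index_level = [210.0, 225.0, 240.0, 270.0, 300.0, 345.0]

a, b, r2 = simple_ols(indicator, index_level)
print(f"intercept={a:.2f} slope={b:.2f} R^2={r2:.4f}")
```

Running the same function once per indicator gives a quick ranking of how closely each one tracks the index on its own.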
Summary of Fit
RSquare                       0.987224
RSquare Adj                   0.985999
Root Mean Square Error        31.25715
Mean of Response              368.508
Observations (or Sum Wgts)    241

Parameter Estimates
Term                  Estimate      Std Error    t Ratio   Prob>|t|
Intercept             -2845.606     879.1908     -3.24     0.0014
10 Leading Ind        52.003024     13.65247     3.81      0.0002
Avg Wkly Hr           -16.89806     11.42545     -1.48     0.1406
UE Claims             0.2057853     0.128894     1.60      0.1118
Mfrs New Orders       -0.003696     0.001289     -2.87     0.0045
Vendor Prfm           -0.738489     0.716969     -1.03     0.3041
Bldg Permit           -0.162113     0.024568     -6.60     <.0001
M2                    -0.564221     0.106776     -5.28     <.0001
Intrt Rate Spread     -18.52264     3.92238      -4.72     <.0001
UE Rate               -0.023767     10.1186      -0.00     0.9981
Capacity Util Rate    -18.84474     3.565865     -5.28     <.0001
Mnfr & Trade Sales    0.0046085     0.000586     7.87      <.0001
Cntrct & Orders       0.0024846     0.001011     2.46      0.0148
PPI                   3.9181248     3.13577      1.25      0.2128
CPI                   -17.05578     2.667677     -6.39     <.0001
Comd Prices           0.0406398     0.634698     0.06      0.9490
Pers Inc              0.4029118     0.097376     4.14      <.0001
Cnsmr Conf            -3.88931      0.731443     -5.32     <.0001
Cnsmr Expt            3.1739166     0.659718     4.81      <.0001
FF Rate               -9.068953     3.771536     -2.40     0.0170
Trade Balance         0.0001526     0.00142      0.11      0.9145
Ex Value USD          1.9144678     0.491674     3.89      0.0001

Effect Test
Source                Nparm   DF   Sum of Squares   F Ratio    Prob>F
10 Leading Ind        1       1    14175.342        14.5089    0.0002
Avg Wkly Hr           1       1    2137.106         2.1874     0.1406
UE Claims             1       1    2490.361         2.5490     0.1118
Mfrs New Orders       1       1    8034.001         8.2231     0.0045
Vendor Prfm           1       1    1036.540         1.0609     0.3041
Bldg Permit           1       1    42538.108        43.5391    <.0001
M2                    1       1    27280.134        27.9221    <.0001
Intrt Rate Spread     1       1    21787.388        22.3001    <.0001
UE Rate               1       1    0.005            0.0000     0.9981
Capacity Util Rate    1       1    27286.563        27.9287    <.0001
Mnfr & Trade Sales    1       1    60460.850        61.8836    <.0001
Cntrct & Orders       1       1    5895.660         6.0344     0.0148
PPI                   1       1    1525.340         1.5612     0.2128
CPI                   1       1    39936.970        40.8768    <.0001
Comd Prices           1       1    4.006            0.0041     0.9490
Pers Inc              1       1    16726.921        17.1205    <.0001
Cnsmr Conf            1       1    27623.796        28.2738    <.0001
Cnsmr Expt            1       1    22613.772        23.1459    <.0001
FF Rate               1       1    5649.062         5.7820     0.0170
Trade Balance         1       1    11.283           0.0115     0.9145
Ex Value USD          1       1    14812.891        15.1615    0.0001

This model shows an excellent fit, with almost 99% of the variance accounted for
(R² = 0.987); it appears that the available data is sufficient to describe the movement in
stock prices. Not surprisingly, the four variables that demonstrated high correlation with
the TCB 500 in the earlier single regressions also produce highly significant p-values here.
On the other hand, several variables are not significant, such as the trade
balance (p = 0.9145), commodity prices (p = 0.9490), and the unemployment rate, which has
a p-value of almost 1 (p = 0.9981)!
It seems illogical to claim that what is probably the most closely watched measure of
economic performance by Wall Street and the Fed has practically no effect on stock
prices. It is also strange that the coefficients for CPI and PPI have opposite signs even
though they both measure inflation. Similarly, the coefficients for the unemployment rate and
unemployment claims have opposite signs, as do those for consumer confidence and consumer
expectations.
C. Multicollinearity
A possible explanation of these discrepancies might be multicollinearity, which
undermines the significance of the individual coefficients. To refine the model, a
correlation plot between the stock index and the economic indicators was drawn to
determine which variables are highly dependent:
[Table: correlation matrix of the index of ten leading indicators, the individual indicators, and the 500 Stock index. Recoverable row: correlations with the index of ten leading indicators (10 Leading Ind)]

Variable              Correlation with 10 Leading Ind
Avg Wkly Hr           0.8968
UE Claims             -0.7549
Mfrs New Orders       0.8954
Bldg Permit           0.2309
M2                    0.9665
Intrt Rate Spread     0.4231
UE Rate               -0.6713
Capacity Util Rate    0.326
Mnfr & Trade Sales    0.9075
Cntrct & Orders       0.7075
PPI                   0.8576
CPI                   0.897
Pers Inc              0.9349
Cnsmr Conf            0.5724
Cnsmr Expt            0.3774
FF Rate               -0.8185
Trade Balance         -0.8714
Ex Value USD          -0.4352
500 Stock             0.8156

From this correlation plot, the composite index of 10 leading indicators shows much
higher correlation with the following individual variables than with the stock index:
Average weekly hours, mfg. (hours)
Manufacturers' new orders
Manufacturing and trade sales
Vendor performance
Building permits for new private housing units (thous.)
Index of stock prices, 500 common stocks, NSA (1941-43=10)
Money supply, M2 (bil. chain 1992 $)
Interest rate spread, 10-year Treasury bonds less federal funds
Trade balance
Personal savings
PPI
CPI
Given the large number of variables so highly correlated with the composite index, further
economic research was conducted on these indicators; we later discovered that the index
of ten leading indicators actually includes many of the above variables. Most importantly,
the index of leading indicators includes the TCB 500 common stock index. As a result, a
second correlation plot was produced without the index of leading indicators:
[Table: correlation matrix of the 500 Stock index and the individual indicators, excluding the index of ten leading indicators. Recoverable row: correlations with 500 Stock]

Variable              Correlation with 500 Stock
Avg Wkly Hr           0.7468
UE Claims             -0.5006
Mfrs New Orders       0.9115
Bldg Permit           0.0489
M2                    0.7652
Intrt Rate Spread     0.114
UE Rate               -0.6377
Capacity Util Rate    0.249
Mnfr & Trade Sales    0.9541
Cntrct & Orders       0.8572
PPI                   0.8176
CPI                   0.8765
Pers Inc              0.9282
Cnsmr Conf            0.5157
Cnsmr Expt            0.2276
FF Rate               -0.6009
Trade Balance         -0.7889
Ex Value USD          -0.4427
Vendor Prfm           0.0922
Comd Prices           0.5992

Based on the above grid, multicollinearity was still found among other variables, as
shown by the highlighted values above. As a result, the following indicators were also
removed: manufacturing new orders, manufacturing and trade sales, personal income,
and PPI. The final correlation plot is shown below:
Correlations (lower triangle; column abbreviations follow the row order)

Variable              Stock    AvgHr    UECl     Bldg     M2       Sprd     UERt     CapU     C&O      CPI      CCnf     CExp     FF       TrBl     ExVl     Vend     Comd
500 Stock             1
Avg Wkly Hr           0.7468   1
UE Claims             -0.5006  -0.812   1
Bldg Permit           0.0489   0.2123   -0.4653  1
M2                    0.7652   0.8006   -0.6496  0.0964   1
Intrt Rate Spread     0.114    0.3059   -0.1116  0.1002   0.4068   1
UE Rate               -0.6377  -0.7067  0.7539   -0.1259  -0.6457  0.1916   1
Capacity Util Rate    0.249    0.5284   -0.6826  0.0688   0.2577   -0.2923  -0.8275  1
Cntrct & Orders       0.8572   0.6887   -0.6018  0.1329   0.6436   -0.1141  -0.8241  0.5588   1
CPI                   0.8765   0.8261   -0.5041  -0.1047  0.877    0.3864   -0.5514  0.2052   0.6909   1
Cnsmr Conf            0.5157   0.5383   -0.7254  0.5578   0.4702   -0.1331  -0.6963  0.5033   0.6707   0.3325   1
Cnsmr Expt            0.2276   0.2914   -0.4463  0.5351   0.268    0.1745   -0.1141  -0.0114  0.1954   0.1679   0.7214   1
FF Rate               -0.6009  -0.7119  0.4645   -0.13    -0.8064  -0.7336  0.3413   -0.0743  -0.4037  -0.7755  -0.1388  -0.0825  1
Trade Balance         -0.7889  -0.797   0.6852   -0.3827  -0.8025  -0.272   0.6413   -0.2785  -0.7084  -0.7662  -0.6671  -0.3832  0.6338   1
Ex Value USD          -0.4427  -0.4955  0.3242   0.4352   -0.4777  0.0488   0.5733   -0.5108  -0.4607  -0.5441  -0.0033  0.3137   0.4345   0.2154   1
Vendor Prfm           0.0922   0.4503   -0.575   0.4549   0.1857   0.2906   -0.2084  0.3459   0.1874   0.1159   0.3686   0.3991   -0.2797  -0.3013  -0.0591  1
Comd Prices           0.5992   0.6374   -0.5003  -0.208   0.4911   -0.1127  -0.7191  0.6462   0.6949   0.6529   0.4724   0.1384   -0.2685  -0.4527  -0.6624  0.1372   1

The table shows that much of the multicollinearity problem has been eliminated
through the removal of five indicators: index of leading indicators, manufacturing new
orders, manufacturing and trade sales, personal income, and PPI. However, it is important
to realize the impossibility of completely removing multicollinearity since all of the
indicators are related in some way through macroeconomic principles. Although the
removal of variables will slightly diminish R², and hence the explanatory power of the model,
our objective is to find the best combination of the most significant and influential
independent indicators in the regression model. Although considerable correlation still
exists among certain variables, further removal of variables would prevent a thorough
analysis of the influence of these indicators on the stock market.
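
The screening step used in this section can be sketched as follows: compute the pairwise Pearson correlation for every pair of indicators and list the pairs whose absolute correlation exceeds a cutoff. The cutoff of 0.9, the function names, and the sample series are assumptions made for illustration; the report does not state a numerical threshold.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def high_correlation_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `columns` maps a variable name to its list of observations.
    The default threshold is an illustrative assumption.
    """
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 4)))
    return flagged


# Hypothetical series, not the report's data.
sample = {
    "CPI": [100.0, 103.0, 107.0, 112.0, 118.0],
    "PPI": [98.0, 101.5, 104.0, 110.0, 115.5],
    "Bldg Permit": [1500.0, 1320.0, 1610.0, 1280.0, 1450.0],
}
for pair in high_correlation_pairs(sample):
    print(pair)
```

A flagged pair is only a candidate for removal; which member of the pair to drop remains a judgment based on economic reasoning.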
Current Estimates
SSE          DFE   MSE        RSquare   RSquare Adj   Cp         AIC
884727.99    228   3880.386   0.9472    0.9444        9.256008   2004.185

Lock  Entered  Parameter            Estimate      nDF   SS          "F Ratio"   "Prob>F"
X     X        Intercept            2030.66465    1     0           0.000       1.0000
_     X        Avg Wkly Hr          28.4736768    1     9987.642    2.574       0.1100
_     _        UE Claims            ?             1     39.81886    0.010       0.9196
_     X        Bldg Permit          -0.083008     1     27995.25    7.215       0.0078
_     X        M2                   -0.3213083    1     209604.7    54.016      0.0000
_     X        Intrt Rate Spread    -32.04761     1     147559.4    38.027      0.0000
_     X        UE Rate              -58.964848    1     43864.13    11.304      0.0009
_     X        Capacity Util Rate   -29.704393    1     121007.9    31.184      0.0000
_     X        Cntrct & Orders      0.01546194    1     388882.6    100.218     0.0000
_     X        CPI                  7.55712093    1     308604.9    79.529      0.0000
_     X        Cnsmr Conf           1.37104117    1     7634.109    1.967       0.1621
_     X        Cnsmr Expt           1.51430471    1     11304.35    2.913       0.0892
_     X        FF Rate              -19.827616    1     88482.79    22.803      0.0000
_     _        Trade Balance        ?             1     533.1804    0.137       0.7117
_     _        Ex Value USD         ?             1     285.0882    0.073       0.7870
_     _        Vendor Prfm          ?             1     26.21451    0.007       0.9347
_     X        Comd Prices          -4.8035508    1     124197.3    32.006      0.0000

Step History
Step   Parameter            Action    "Sig Prob"   Seq SS      RSquare   Cp       p
1      CPI                  Entered   0.0000       12866812    0.7683    746.62   2
2      Cntrct & Orders      Entered   0.0000       2028176     0.8894    234.53   3
3      Capacity Util Rate   Entered   0.0000       464129.7    0.9171    118.88   4
4      Intrt Rate Spread    Entered   0.0001       91356.8     0.9226    97.728   5
5      Avg Wkly Hr          Entered   0.0000       104074.3    0.9288    73.348   6
6      Comd Prices          Entered   0.0018       48692.8     0.9317    63.005   7
7      M2                   Entered   0.0159       28276.16    0.9334    57.838   8
8      UE Rate              Entered   0.0000       93389.68    0.9389    36.166   9
9      Cnsmr Expt           Entered   0.0010       47220.86    0.9418    26.197   10
10     FF Rate              Entered   0.0001       62452.44    0.9455    12.367   11
11     Avg Wkly Hr          Removed   0.5487       1431.699    0.9454    10.73    10
12     Bldg Permit          Entered   0.0749       12548.41    0.9462    9.549    11
13     Avg Wkly Hr          Entered   0.1237       9302.653    0.9467    9.1911   12
14     Cnsmr Conf           Entered   0.1621       7634.109    0.9472    9.256    13

The above stepwise regression shows that much of the multicollinearity problem has been
eliminated; for example, the unemployment rate now has a significant p-value (p = 0.0009), and
the coefficients for consumer confidence and consumer expectations no longer have opposite
signs. However, average weekly hours, consumer confidence, and consumer expectations are
still included in the model even though they exhibit high p-values (0.1100, 0.1621, and
0.0892, respectively). A possible explanation is that their
inclusion in the model contributes to a higher adjusted R² value. After testing with
various combinations of the variables, the final stepwise regression model (step model 2)
is shown below:
Response: 500 Stock
Stepwise Regression Control
Prob to Enter 0.250
Prob to Leave 0.250

Current Estimates
SSE          DFE   MSE        RSquare   RSquare Adj   Cp         AIC
914213.16    231   3957.633   0.9454    0.9433        10.72975   2006.086

Lock  Entered  Parameter            Estimate      nDF   SS          "F Ratio"   "Prob>F"
X     X        Intercept            2744.42601    1     0           0.000       1.0000
_     _        Avg Wkly Hr          ?             1     1431.699    0.361       0.5487
_     _        UE Claims            ?             1     1904.568    0.480       0.4891
_     _        Bldg Permit          ?             1     12548.41    3.201       0.0749
_     X        M2                   -0.3164773    1     217531.8    54.965      0.0000
_     X        Intrt Rate Spread    -30.072942    1     139643.7    35.285      0.0000
_     X        UE Rate              -68.853244    1     109618.3    27.698      0.0000
_     X        Capacity Util Rate   -26.328913    1     167767      42.391      0.0000
_     X        Cntrct & Orders      0.01518301    1     482884.6    122.013     0.0000
_     X        CPI                  8.62389599    1     838177.3    211.788     0.0000
_     _        Cnsmr Conf           ?             1     216.337     0.054       0.8157
_     X        Cnsmr Expt           2.16589636    1     163006.5    41.188      0.0000
_     X        FF Rate              -15.707916    1     75646.95    19.114      0.0000
_     _        Trade Balance        ?             1     0.215112    0.000       0.9941
_     _        Ex Value USD         ?             1     1169.58     0.295       0.5878
_     _        Vendor Prfm          ?             1     64.92614    0.016       0.8984
_     X        Comd Prices          -4.6080835    1     139365.5    35.214      0.0000

Although the above model has a slightly lower adjusted R² value than the previous
stepwise model (0.9433 < 0.9444), the entered indicators all show highly significant
p-values (p < 0.0001).
Using the set of indicators obtained from the results of the stepwise regression,
the standard least squares multiple regression was performed again to create a final linear
model (Model A). The regression is shown below:
Response: 500 Stock

Summary of Fit
RSquare                       0.945412
RSquare Adj                   0.943285
Root Mean Square Error        62.90972
Mean of Response              368.508
Observations (or Sum Wgts)    241

Parameter Estimates
Term                  Estimate      Std Error    t Ratio   Prob>|t|
Intercept             2744.426      483.8281     5.67      <.0001
M2                    -0.316477     0.042687     -7.41     <.0001
Intrt Rate Spread     -30.07294     5.062709     -5.94     <.0001
UE Rate               -68.85324     13.0828      -5.26     <.0001
Capacity Util Rate    -26.32891     4.043872     -6.51     <.0001
Cntrct & Orders       0.015183      0.001375     11.05     <.0001
CPI                   8.623896      0.592589     14.55     <.0001
Cnsmr Expt            2.1658964     0.337484     6.42      <.0001
FF Rate               -15.70792     3.592863     -4.37     <.0001
Comd Prices           -4.608083     0.776534     -5.93     <.0001

Effect Test
Source                Nparm   DF   Sum of Squares   F Ratio     Prob>F
M2                    1       1    217531.76        54.9651     <.0001
Intrt Rate Spread     1       1    139643.69        35.2847     <.0001
UE Rate               1       1    109618.27        27.6979     <.0001
Capacity Util Rate    1       1    167767.02        42.3908     <.0001
Cntrct & Orders       1       1    482884.62        122.0135    <.0001
CPI                   1       1    838177.30        211.7875    <.0001
Cnsmr Expt            1       1    163006.51        41.1879     <.0001
FF Rate               1       1    75646.95         19.1142     <.0001
Comd Prices           1       1    139365.53        35.2144     <.0001

In this final revised model, the R² value is 0.9454. Interestingly, three of the four good
variables identified from the single regressions have been removed in the refinement
process. Although the R² value of our refined model is slightly less than the R² value of
0.987 obtained in our preliminary multiple regression, we can be more confident that the final
combination of indicators exhibits high significance, greater independence, and the most
influence on the TCB 500.
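
The relationship between the R² and adjusted R² values reported in the fit summaries can be checked with the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The sketch below applies it to the figures reported for Model A (n = 241, k = 9) and for the preliminary regression (n = 241, k = 21); the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 for a least squares model that includes an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Inputs taken from the Summary of Fit tables in this report.
model_a = adjusted_r_squared(0.945412, 241, 9)
preliminary = adjusted_r_squared(0.987224, 241, 21)

print(f"Model A adjusted R^2:     {model_a:.6f}")
print(f"Preliminary adjusted R^2: {preliminary:.6f}")
```

Both results agree with the RSquare Adj values in the corresponding tables (0.943285 and 0.985999), which confirms that the adjustment penalizes the preliminary model for its larger number of predictors.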
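(i) $\varepsilon_i \sim N(0, \sigma^2)$, $i = 1, \ldots, n$
(ii) $\mathrm{Var}(\varepsilon_i) = \sigma^2$, $i = 1, \ldots, n$
(iii) $(\varepsilon_1, \ldots, \varepsilon_n)$ mutually independent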
If the final least squares model fits the first Gauss-Markov assumption, the
residuals of the model must be normally distributed. A normal quantile plot of the
residuals is shown below:
[Figure: normal quantile plot of the residuals]
The above plot shows that the residuals are extremely close to being normally distributed;
therefore, the first Gauss-Markov assumption is valid for this regression model.
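
A numerical counterpart to the normal quantile plot is the correlation between the ordered residuals and the corresponding standard normal quantiles; values close to 1 indicate an approximately straight quantile plot. The sketch below is only an illustration of that idea. The plotting positions (i + 0.5)/n, the function name, and the residual values are assumptions, not something taken from the report's software output.

```python
from math import sqrt
from statistics import NormalDist, fmean


def normal_quantile_correlation(residuals):
    """Correlation between sorted residuals and standard normal quantiles."""
    n = len(residuals)
    ordered = sorted(residuals)
    standard_normal = NormalDist()
    # Plotting positions (i + 0.5) / n are one common convention (assumed here).
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    o_bar, t_bar = fmean(ordered), fmean(theoretical)
    cov = sum((o - o_bar) * (t - t_bar) for o, t in zip(ordered, theoretical))
    var_o = sum((o - o_bar) ** 2 for o in ordered)
    var_t = sum((t - t_bar) ** 2 for t in theoretical)
    return cov / sqrt(var_o * var_t)


# Hypothetical residuals, not the report's data.
example_residuals = [-95.0, -60.0, -41.0, -22.0, -8.0, 3.0, 15.0, 30.0, 52.0, 88.0]
print(f"quantile correlation = {normal_quantile_correlation(example_residuals):.4f}")
```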
Next, for the new model to be accepted, possible violations of the constant-variance
assumption must be tested for. The whole-model test and the residual plot of the
multiple regression are shown below:
Whole-Model Test
[Figure: 500 Stock actual by predicted]

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio     Prob>F
Model      9     15833148         1759239       444.5179    <.0001
Error      231   914213           3958
C Total    240   16747362

[Figure: residuals by 500 Stock predicted]
The whole model test plot shows that the data points are more scattered at higher
values of the x-axis, but the variances are not significantly increasing; there are no
predicted y values that terribly miss the mark. Similarly, the residual plot does not show
any significant trend of increasing variance, although the residuals appear more scattered
at higher values of x. Thus, the model can be accepted as fitting the constant variance
assumption, meaning that the increase in values of the economic indicators does not
produce overall increasing variance in the TCB 500.
Although the residual plot shows no significantly discrepant values, the residuals
seem to display a slightly cyclical pattern. So to test for autocorrelation in our model, the
Durbin-Watson test was conducted. The results are as follows:
Durbin-Watson
Durbin-Watson    Number of Obs.    AutoCorrelation
0.5751493        241               0.7038
For a one-sided test at α = 0.05, the Durbin-Watson critical values for k = 11 and n = 200 are dL
= 1.65 and dU = 1.89 (W. H. Greene, Econometric Analysis). Since our statistic, d = 0.575, is well
below dL, our data contains serious positive autocorrelation. This is not surprising because the data consists
of time series statistics that include business cycles and economic fluctuations. There are
two alternatives to solving the autocorrelation problem: 1) perform a two-stage
estimation procedure to modify the data by weighted differencing or 2) add additional
variables which can account for the apparent autocorrelation effect. Although the second
alternative is generally a superior approach, all of the x-variables used in our regressions
are economic indicators and therefore unavoidably reflect business fluctuations; if we add
any more of the variables that we eliminated, we would end up with our preliminary
model and still not be able to correct autocorrelation. Therefore, the two-stage estimation
procedure was performed, and the no-intercept regression of the residuals on the lagged
residuals is shown below:
Response: Residual 500 Stock

Summary of Fit
RSquare                       ?
RSquare Adj                   ?
Root Mean Square Error        44.27803
Mean of Response              0.136608
Observations (or Sum Wgts)    240

Parameter Estimates
Term             Estimate     Std Error   t Ratio   Prob>|t|
Intercept        Zeroed 0     0           ?         ?
Lag Residuals    0.6921693    0.04757     14.55     <.0001

Effect Test
Source           Nparm   DF   Sum of Squares   F Ratio     Prob>F
Lag Residuals    1       1    415083.08        211.7183    <.0001

Response: T. 500 Stock (Model A transformed)

Summary of Fit
RSquare                       0.860731
RSquare Adj                   0.855281
Root Mean Square Error        32.79234
Mean of Response              117.0965
Observations (or Sum Wgts)    240

Parameter Estimates (terms in the same order as Model A)
Term                  Estimate      Std Error    t Ratio   Prob>|t|
Intercept             870.58837     133.8725     6.50      <.0001
M2                    -0.29684      0.054307     -5.47     <.0001
Intrt Rate Spread     -21.77922     5.145625     -4.23     <.0001
UE Rate               -93.16051     11.11445     -8.38     <.0001
Capacity Util Rate    -23.98903     4.057743     -5.91     <.0001
Cntrct & Orders       0.0040738     0.000847     4.81      <.0001
CPI                   10.046571     0.761136     13.20     <.0001
Cnsmr Expt            1.6722569     0.317441     5.27      <.0001
FF Rate               -10.81747     4.50438      -2.40     0.0171
Comd Prices           -3.875311     1.034767     -3.75     0.0002

Effect Test
Source                Nparm   DF   Sum of Squares   F Ratio     Prob>F
M2                    1       1    32127.99         29.8771     <.0001
Intrt Rate Spread     1       1    19264.29         17.9146     <.0001
UE Rate               1       1    75549.69         70.2567     <.0001
Capacity Util Rate    1       1    37583.84         34.9507     <.0001
Cntrct & Orders       1       1    24857.97         23.1164     <.0001
CPI                   1       1    187351.23        174.2255    <.0001
Cnsmr Expt            1       1    29841.85         27.7511     <.0001
FF Rate               1       1    6201.93          5.7674      0.0171
Comd Prices           1       1    15082.48         14.0258     0.0002

Durbin-Watson
Durbin-Watson    Number of Obs.    AutoCorrelation
0.9526597        240               0.5008

[Figures: whole-model plots of actual by predicted values for Model A (500 Stock) and for Model A transformed (T. 500 Stock)]
After the two-stage estimation, the new value of d, 0.953, is still well below
the lower critical value dL = 1.65, and significant autocorrelation still exists (autocorrelation = 0.5008).
Moreover, a tradeoff must be made between the value of d and R²: while d increased from
0.575 to 0.953, R² dropped from 0.945 (Model A) to 0.861 in the transformed
multiple regression. The whole-model test plot of the transformed regression
correspondingly shows that the fit has become poorer. The federal funds rate and
commodity prices also became less significant (p = 0.0171 and p = 0.0002, respectively,
compared with p < 0.0001 in Model A). This estimation
procedure could have been carried out with more lags, but collinearity would start to
become a problem since the lagged residuals are correlated with each other.
The tested model therefore does not satisfy the third Gauss-Markov assumption,
which states that the residuals are mutually independent. Because the
autocorrelation problem could not be eliminated to an acceptable extent, this model
would most likely not make an accurate forecasting tool.
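
The quantities used in this section can be sketched in a few lines: the Durbin-Watson statistic, the no-intercept regression of each residual on its predecessor, and the weighted differencing applied to each series before re-estimating the model. This is a minimal illustration under the assumption that the transformation takes the form x_t - rho * x_(t-1); the function names and the residual values are hypothetical and this is not the software routine used for the report.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def lag_one_coefficient(residuals):
    """Slope of the no-intercept regression of e_t on e_(t-1)."""
    numerator = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    denominator = sum(residuals[t - 1] ** 2 for t in range(1, len(residuals)))
    return numerator / denominator


def quasi_difference(series, rho):
    """Transform a series to x_t - rho * x_(t-1); one observation is lost."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


# Hypothetical, positively autocorrelated residuals (not the report's data).
example = [12.0, 15.0, 11.0, 6.0, -2.0, -9.0, -13.0, -8.0, -1.0, 7.0]
rho_hat = lag_one_coefficient(example)
print(f"d = {durbin_watson(example):.3f}, rho = {rho_hat:.3f}")
print(f"d after differencing = {durbin_watson(quasi_difference(example, rho_hat)):.3f}")
```

The loss of one observation in `quasi_difference` corresponds to the drop from 241 to 240 observations in the transformed regression. In practice the response and every predictor would be transformed with the same coefficient before refitting.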
[Figures: two time series plots, time (1977-1999)]
We then gathered actual data on our indicators for a period outside the data range we
used to derive the models. As the graph below shows, the predicted trend is far off
the mark for the time period 1967-1977. We found that the farther we depart from
our original data range, the larger the prediction errors. Our regression model even predicted
negative values for the TCB 500! This highlights the fact that our model only works well
within the range of our original data and performs poorly in forecasting data outside this
range.
[Figure: model prediction of the index value, time (1967-1977)]
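
The out-of-range behavior described above can be illustrated with a small holdout comparison: fit a linear model on one range of data, then compare the root mean square error inside and outside that range. The data below are a hypothetical curved relationship chosen only to show the effect; the function names are likewise assumptions and the numbers have no connection to the TCB 500.

```python
from math import sqrt
from statistics import fmean


def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    return y_bar - slope * x_bar, slope


def rmse(x, y, intercept, slope):
    """Root mean square prediction error of the fitted line on (x, y)."""
    errors = [(b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)]
    return sqrt(fmean(errors))


# Hypothetical curved relationship; the line is fitted on x = 10..20 only.
x_in = [float(v) for v in range(10, 21)]
y_in = [0.5 * v * v for v in x_in]
x_out = [float(v) for v in range(0, 10)]
y_out = [0.5 * v * v for v in x_out]

intercept, slope = fit_line(x_in, y_in)
rmse_in = rmse(x_in, y_in, intercept, slope)
rmse_out = rmse(x_out, y_out, intercept, slope)
print(f"in-sample RMSE = {rmse_in:.2f}, out-of-sample RMSE = {rmse_out:.2f}")
print(f"prediction at x = 0: {intercept:.2f}")
```

In this toy case the fitted line also predicts a negative value at the edge of the holdout range even though every true value is non-negative, which mirrors the negative index predictions noted above.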
III. CONCLUSION
A. Single Regression Discussion
From our initial single regression plots, the four indicators that demonstrated the
strongest correlation with stock price were the index of ten leading indicators,
manufacturing and trade sales, CPI, and personal income. This result is not surprising: the
leading indicators are a broad measure of the economy and should move in sync with the
stock market; manufacturing and trade sales are a good indicator of overall economic
activity and output, similar to GDP, and therefore should correspond with stock prices;
the CPI measures the price level which generally increases with rising aggregate demand;
and the more income an individual has, the more stocks he/she is likely to buy.
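Summary of Fit
RSquare                       0.945412
RSquare Adj                   0.943285
Root Mean Square Error        62.90972
Mean of Response              368.508
Observations (or Sum Wgts)    241

Parameter Estimates
Term                  Estimate      Std Error    t Ratio   Prob>|t|
Intercept             2744.426      483.8281     5.67      <.0001
M2                    -0.316477     0.042687     -7.41     <.0001
Intrt Rate Spread     -30.07294     5.062709     -5.94     <.0001
UE Rate               -68.85324     13.0828      -5.26     <.0001
Capacity Util Rate    -26.32891     4.043872     -6.51     <.0001
Cntrct & Orders       0.015183      0.001375     11.05     <.0001
CPI                   8.623896      0.592589     14.55     <.0001
Cnsmr Expt            2.1658964     0.337484     6.42      <.0001
FF Rate               -15.70792     3.592863     -4.37     <.0001
Comd Prices           -4.608083     0.776534     -5.93     <.0001
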
Based on the model, the stock market is negatively correlated with M2, interest
rates, unemployment rate, commodity prices, and capacity utilization rate and positively
correlated with CPI, consumer expectations, and manufacturing contracts and orders.
Stock prices should be inversely related to interest rates because higher rates imply
higher costs of borrowing money, and bonds would appear more attractive. Logically, the
economy would be in a recession given a high unemployment rate, so these two factors
are negatively related. Increases in CPI and consumer expectation generally imply
growing aggregate demand and high confidence in the economy, so they are reasonably
correlated with the stock index in this model. Contracts and orders for plant and
equipment are a measure of investment in the economy, and therefore correlate
positively with the TCB 500 as well. However, M2 would be expected to correlate
positively with stocks since it includes money market funds. In addition, higher capacity
utilization rates would imply that firms are operating with higher efficiency and output,
and it should have been directly correlated with stocks as well, from an economic
perspective. The variables M2 and CPI should have a direct relationship as well since
money supply growth leads to a proportional rise in the price level; instead, they had
opposite signs.
Since not all of these variables are on the same scale, their coefficients cannot be
compared to evaluate the relative influence of each indicator on the stock index.
However, it was surprising that indicators such as commodity prices were more
significant than variables such as the trade balance, which would be assumed to have
greater importance and more impact on the entire economy.
Multicollinearity was substantial in our preliminary multiple regression since all
the indicators are related according to economic theory; as more variables were
eliminated in the refinement process, the R² value in our model decreased slightly in
return for more significant p-values and more logical coefficients. In fact, CPI was the
only one of the four quality variables from single regression analysis to remain in the
final set of indicators. This was the result of the removal of highly dependent variables.
In addition, those four indicators had good polynomial fits with stock prices, but the
multiple regression was based on linear relationships.
While the TCB 500 stock index was predicted fairly well by our data within the
same time range, it seems futile to attempt to forecast the stock market outside the range
using our set of economic indicators. It is surprising that given all of the measures of
economic performance that we used, a successful prediction model failed to be
developed. This could be partly due to high autocorrelation problems in the model, but
more importantly, it suggests that many other factors contribute to the movement of the
equity market than just the economic indicators.
IV. SUPPLEMENTS:
APPENDIX A:
APPENDIX B:
APPENDIX B
500 Stock By 10 Leading Ind
[Figures: 500 Stock by 10 Leading Ind with polynomial fit (degree 6); residuals by 10 Leading Ind]

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio     Prob>F
Model      6     15941455         2656909       771.4502    <.0001
Error      234   805906           3444
C Total    240   16747362

Parameter Estimates
Term                Estimate      Std Error    t Ratio   Prob>|t|
Intercept           15611913      7426678      2.10      0.0366
10 Leading Ind      -611383.1     391548.6     -1.56     0.1198
10 Leading Ind^2    6226.7869     9018.339     0.69      0.4906
10 Leading Ind^3    50.484734     122.3188     0.41      0.6802
10 Leading Ind^4    -1.477852     1.039863     -1.42     0.1566
10 Leading Ind^5    0.0107058     0.005017     2.13      0.0339
10 Leading Ind^6    -0.000026     0.00001      -2.59     0.0103

500 Stock By Mnfr & Trade Sales
[Figures: 500 Stock by Mnfr & Trade Sales with polynomial fit (degree 6); residuals by Mnfr & Trade Sales]

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio     Prob>F
Model      6     16472623         2745437       2338.343    <.0001
Error      234   274738           1174
C Total    240   16747362

Parameter Estimates
Term                    Estimate      Std Error    t Ratio   Prob>|t|
Intercept               -195090.1     113131.5     -1.72     0.0859
Mnfr & Trade Sales      2.321316      1.194856     1.94      0.0532
Mnfr & Trade Sales^2    -0.000011     0.000005     -2.16     0.0317
Mnfr & Trade Sales^3    2.86e-11      1.2e-11      2.38      0.0182
Mnfr & Trade Sales^4    -4.01e-17     1.55e-17     -2.59     0.0101
Mnfr & Trade Sales^5    2.945e-23     1.05e-23     2.80      0.0055
Mnfr & Trade Sales^6    -8.87e-30     2.96e-30     -3.00     0.0030

500 Stock By CPI
[Figures: 500 Stock by CPI with polynomial fit (degree 6); residuals by CPI]

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio     Prob>F
Model      6     16569482         2761580       3632.854    <.0001
Error      234   177879           760
C Total    240   16747362

Parameter Estimates
Term         Estimate      Std Error    t Ratio   Prob>|t|
Intercept    17244.23      16028.91     1.08      0.2831
CPI          -1207.261     883.2916     -1.37     0.1730
CPI^2        34.333804     19.95        1.72      0.0866
CPI^3        -0.505366     0.236511     -2.14     0.0337
CPI^4        0.0040617     0.001553     2.62      0.0095
CPI^5        -0.000017     0.000005     -3.15     0.0018
CPI^6        2.8484e-8     7.6e-9       3.75      0.0002

500 Stock By Pers Inc
[Figures: 500 Stock by Pers Inc with polynomial fit (degree 6); residuals by Pers Inc]

Analysis of Variance
Source     DF    Sum of Squares   Mean Square   F Ratio     Prob>F
Model      6     16522641         2753773       2867.479    <.0001
Error      234   224721           960
C Total    240   16747362

Parameter Estimates
Term          Estimate      Std Error    t Ratio   Prob>|t|
Intercept     -523578.5     194766.6     -2.69     0.0077
Pers Inc      733.69489     266.0705     2.76      0.0063
Pers Inc^2    -0.425889     0.150455     -2.83     0.0050
Pers Inc^3    0.0001311     0.000045     2.91      0.0040
Pers Inc^4    -2.256e-8     7.548e-9     -2.99     0.0031
Pers Inc^5    2.059e-12     6.7e-13      3.08      0.0024
Pers Inc^6    -7.78e-17     2.46e-17     -3.16     0.0018