North-Holland
R.L. Wilson and R. Sharda / Bankruptcy prediction using neural networks

searchers are studying a "biologically inspired" way of processing information. To this point, neural networks have proven to be good at solving some real-world problems, especially in the areas of forecasting and classification decision problems.

The exploratory study presented in this paper contrasts neural network predictive accuracy with that of discriminant analysis for the decision problem of firm bankruptcy prediction. Using a resampling methodological design, a series of experiments was conducted to investigate the effect of the training and testing (holdout) set composition on predictive accuracy. These predictive results were then contrasted with the accuracy obtained by classical discriminant analysis to determine the conditions where neural network models are significantly better predictors.

The major objectives of this paper are the following. First, we report the results of a comprehensive, statistically sound comparison of discriminant analysis and a neural network model. Second, we utilize in this analysis measures of the validity of any classification technique which have been used extensively in the psychology literature. These measures allow researchers to assess the true value added by a technique. Third, we conclude with a brief conjecture on how neural networks may affect decision support systems.

Our objective is not to examine the speed of a new algorithm, or study a new architecture, but rather to test a neural network's effectiveness in performing classifications as contrasted against the incumbent techniques. Better algorithms should only improve the performance. In this sense, our results should provide a lower bound of the neural network model's predictive performance in bankruptcy prediction.

Section 2 briefly reviews bankruptcy prediction and the neural network literature. Section 3 describes our comparison procedure. Section 4 presents the results, while section 5 discusses implications of our results for researchers from three areas: bankruptcy prediction, neural networks, and decision support systems.

2. Brief review of relevant research

2.1. Bankruptcy risk prediction

In the present days of economic turmoil, it is not surprising that the bankruptcy prediction problem remains of great interest to researchers as well as creditors, shareholders and auditors. Firm insolvency is a problem throughout the industrialized countries of the world [4]. Creditors have a vested interest in this decision problem in that they wish to identify negative developments of their borrowers. Stockholders hold similar monetary concerns. Auditors, as a normal responsibility, must evaluate the financial position of a client to determine whether or not the firm's operating ability is endangered [3]. Thus, senior management of a firm and the board of directors can attempt to avert the crisis [34]. For all parties, it is essential that an objective opinion on the risk of bankruptcy can be formed as early as possible.

Predicting bankruptcy has been studied extensively in the accounting literature. The first studies were performed to determine whether financial ratios provide useful information [1,5]. There have been many different studies since [5] utilizing financial ratios for bankruptcy prediction, a majority of which use a multivariate discriminant analysis approach [1,2,4,7,12,25,32]. The major evolution in these studies is to identify financial and economic variables which improve predictive performance. Two statistical techniques appear to have been used the most: discriminant analysis and logistic regression [6]. No technique clearly provides substantially better results. We have chosen to study discriminant analysis as a comparative classification technique because of its repeated use in many other problem areas.

Discriminant analysis is a statistical technique used to classify objects into distinct groups on the basis of an object's observed characteristics. Basically, a linear discriminant function is developed which will compute a "score" for an object. This function is a weighted linear combination of the object's observed values on discriminating characteristics. These weights represent, in essence, the relative importance and impact of the various characteristics. On the basis of its discriminant score, an object is then classified. Often, computer software packages compute the probability of group membership on the basis of this procedure [39].

Multivariate discriminant analysis is subject to a number of restrictive assumptions, including the requirement for the discriminating variables to be jointly multivariate normal. This multivariate normality of the variables is critical to the discriminant analysis procedure; otherwise, results obtained may be erroneous [25]. This theoretical assumption often cannot be realized in practice [4].

2.2. Neural network applications

Multi-layer, feed forward neural networks have been applied to many problem domains in and outside the business field. For instance, neural networks have been successfully trained to determine whether loan applications should be approved [20]. Similarly, neural networks have been shown to predict mortgage applicant solvency better than mortgage writers [11].

Predicting the rating of corporate bonds and attempting to predict their profitability is another area where neural networks have been applied successfully [14,21]. Neural networks outperformed regression analysis and other mathematical modeling tools in predicting bond rating and profitability. The main conclusion reached was that neural networks provided a more general framework for connecting financial information of a firm to the respective bond rating.

Fraud prevention is another area of neural network applications in business. Credit card fraud, a costly and difficult problem faced by banks, was addressed by Chase Manhattan Bank of New York using neural networks [33]. These models were shown to be much more successful than traditional regression analysis. Additionally, neural networks have been used in the validation of bank signatures [19,30]. These networks identified forgeries significantly better than any human 'expert'.

Several people have tested the applicability of neural networks in financial markets. Collard [10] states that his neural network model for commodity trading would have resulted in significant profits over other trading strategies. Kamijo and Tanigawa [24] used a neural network to chart Tokyo Stock Exchange data. Their findings were that the results of the model would beat a 'buy and hold' strategy. Additionally, a neural model for predicting percentage change in the S&P 500 five days ahead using a variety of economic indicators has been developed [18]. The authors claim that the model has provided more accurate prediction than alleged experts in the field using the same indicators.

There have been many other applications of neural networks in non-business related fields such as speech recognition, robotics, radar detection and many others. Additionally, other network paradigms have been useful in solving other types of decision problems. Discussion of these other applications and approaches is beyond the scope of this paper.

Most of the studies which have compared neural networks with statistical techniques report the results on the basis of either a single experiment or in an anecdotal form. There is a need for a thorough comparison using sound statistical procedures. Our study is based on a resampling technique to assess the effectiveness of neural networks on a statistical basis. Further, we borrow some measures from the psychology literature to isolate the value added by a classification technique. We argue that such measures ought to be used in determining alleged superiority of any such model.

3. Method

3.1. Financial ratios and data collection

The basic intent of this study is to compare and contrast the predictive performance of classical multivariate discriminant analysis to that of neural networks for firm bankruptcy. The Altman study [1] has been used as the standard of comparison for subsequent bankruptcy classification studies using discriminant analysis. Most follow-up studies have identified several other attributes to improve prediction performance. In this exploratory study, we wanted to see if the neural networks can come close to the traditional techniques. More sophisticated inputs to the neural network model should not worsen its performance. Thus, this could establish a lower bound on neural network performance in bankruptcy prediction. For these reasons, we used the same financial ratios as Altman [1]. These ratios were:

X1: Working Capital / Total Assets
X2: Retained Earnings / Total Assets
X3: Earnings before Interest and Taxes / Total Assets
X4: Market Value of Equity / Total Debt
X5: Sales / Total Assets
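As a rough illustration of the inputs listed above, the sketch below computes X1 to X5 from raw accounting figures and then forms a weighted linear combination of them, in the spirit of the discriminant "score" described in section 2.1. The class name, field names and function names are hypothetical, and the weights in the usage note are arbitrary placeholders, not estimates from this study or from Altman [1].

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FirmFinancials:
    """Raw accounting figures for one firm (hypothetical field names).

    Any currency unit works as long as it is the same for every field.
    """
    working_capital: float
    retained_earnings: float
    ebit: float                  # earnings before interest and taxes
    market_value_equity: float
    total_debt: float
    sales: float
    total_assets: float


def altman_ratios(f: FirmFinancials) -> List[float]:
    """Return [X1, X2, X3, X4, X5] as defined in the list above."""
    if f.total_assets == 0 or f.total_debt == 0:
        raise ValueError("total_assets and total_debt must be non-zero")
    return [
        f.working_capital / f.total_assets,      # X1
        f.retained_earnings / f.total_assets,    # X2
        f.ebit / f.total_assets,                 # X3
        f.market_value_equity / f.total_debt,    # X4
        f.sales / f.total_assets,                # X5
    ]


def discriminant_score(ratios: Sequence[float],
                       weights: Sequence[float],
                       intercept: float = 0.0) -> float:
    """Weighted linear combination of the ratios.

    The weights must be supplied by the caller; this sketch does not
    estimate them and does not assume any published coefficients.
    """
    if len(ratios) != len(weights):
        raise ValueError("ratios and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, ratios))
```

For example, a made-up firm with working capital 20, retained earnings 30, EBIT 10, market value of equity 80, total debt 40, sales 150 and total assets 100 yields the ratios 0.2, 0.3, 0.1, 2.0 and 1.5. With placeholder unit weights, `discriminant_score` simply returns their sum; real weights would come from fitting a discriminant function to training data.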
[Figure labels: BR = 1, NBR = 1]

3.3. Implementation of comparative methods

4.1. Training sets - "Learning"

The first results to be presented display the

Table 3
Testing set - correct classification (%) (all cases)
Training set    Testing set composition

Table 4
Testing set - correct classifications (%), bankrupt cases

Training set    Testing set composition
composition     50/50           80/20           90/10
                NN      DA      NN      DA      NN      DA
50/50           97.0    79.75   92.0    82.0    92.5    90.0

Table 5
Testing set - correct classifications (%), non-bankrupt cases

Training set    Testing set composition
composition     50/50           80/20           90/10
                NN      DA      NN      DA      NN      DA
50/50           98.0    96.75   96.5    94.25   96.0    93.5

Table 8
Predictive validity of classifications
where $g$ = groups (bankrupt and non-bankrupt), $n_g$ = number of test cases in group $g$, $b_g$ = training base rate of group $g$, $o_g$ = observed correct predictions for group $g$ (refer to Table 4), $e_g$ = expected correct predictions for group $g$ by chance ($n_g b_g$), $O$ = total correct predictions ($\sum_g o_g$), $E$ = total correct predictions obtainable by chance ($\sum_g e_g$), $N$ = total number of cases ($\sum_g n_g$).

Thus, this statistic will indicate whether predictive results obtained by neural networks and discriminant analysis differ greatly from those that can be obtained by chance. Additionally, one can also calculate a similar statistical measure for each separate classification group. Using the same notation as above, the standard normal test statistic for each group (illustrating whether the predictive results obtained by a classification technique differ significantly from chance) is

$$\frac{(o_g - e_g)\, n_g^{1/2}}{\left(e_g\,(n_g - e_g)\right)^{1/2}}. \qquad (2)$$

As Table 8 indicates, the predictive validity of neural networks and discriminant analysis is extremely significant. Aggregately, both methods are significantly better than pure chance regardless of the base rate of the training sets. Considering the predictive validity by specific groups, the only non-significant result occurs when predicting non-bankrupt firms when the training set base rate is 90%, though it is still considerably better than chance. Not surprisingly, as previous results have already shown, neural networks are judged more statistically significant than chance as compared to discriminant analysis in every case.

Another approach useful in assessing a prediction method is determining how much better a classification approach predicts compared to chance assignment. An index useful in such a setting is the improvement-over-chance or reduction-in-error index [22,26],

$$I = \frac{H_o - H_e}{1 - H_e}, \qquad (3)$$

where $H_o$ is the observed rate of correct predictions and $H_e$ is the correct prediction rate expected by chance. Using the previous notation, $H_e$ is defined as $\left(\sum_g b_g n_g\right)/N$ for the aggregate case, and $b_g$ for each separate group. The index $I$ represents a reduction-in-error statistic in that $100 \cdot I\,\%$ fewer prediction errors result using the classification rule than would be expected by chance.

Table 9 provides this calculation for the neural network and discriminant analysis predictions aggregately across firm type, as well as the improvement-over-chance index for both bankrupt and non-bankrupt cases. Thus, when the 50-50 training set is used to train a neural network, the network model provides 92.98% fewer classification errors than would occur by blind guessing. Also, as another example, the improvement-over-chance for the prediction of bankrupt firms with neural networks trained on a balanced training set is 91.48%. Also of interest is that even on the 90% base rate where neural networks did not indicate significant differences over chance predictions for non-bankrupt cases, the improve-

Table 9
Reduction-in-error of classifications (%)

Training set    Bankrupt        Non-bankrupt    Total
composition     NN      DA      NN      DA      NN      DA
50/50           91.5    61.8    93.7    89.7    93.0    81.0
80/20           56.2    41.9    92.1    79.6    69.1    55.4
90/10           43.2    37.8    78.3    75.8    50.2    45.4
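The two measures defined in equations (2) and (3) are simple to compute. The following is a minimal sketch, assuming the notation above; the function names are our own, and the numbers in the usage note are illustrative inputs rather than entries taken from the paper's tables.

```python
import math
from typing import Sequence


def group_z_statistic(o_g: float, n_g: float, b_g: float) -> float:
    """Standard normal test statistic of equation (2) for one group.

    o_g: observed correct predictions in the group
    n_g: number of test cases in the group
    b_g: training base rate of the group, so e_g = n_g * b_g
    """
    e_g = n_g * b_g
    if not 0.0 < e_g < n_g:
        raise ValueError("expected correct count must lie strictly between 0 and n_g")
    return (o_g - e_g) * math.sqrt(n_g) / math.sqrt(e_g * (n_g - e_g))


def aggregate_chance_rate(n: Sequence[float], b: Sequence[float]) -> float:
    """H_e for the aggregate case: sum(b_g * n_g) / N."""
    if len(n) != len(b):
        raise ValueError("n and b must have the same length")
    total = sum(n)
    if total == 0:
        raise ValueError("total number of cases must be positive")
    return sum(bg * ng for bg, ng in zip(b, n)) / total


def improvement_over_chance(h_o: float, h_e: float) -> float:
    """Reduction-in-error index I of equation (3)."""
    if h_e >= 1.0:
        raise ValueError("chance rate must be below 1")
    return (h_o - h_e) / (1.0 - h_e)
```

As an arithmetic check with made-up inputs: 97 correct predictions out of 100 cases at a base rate of 0.5 give a statistic of 9.4, and an observed rate of 0.965 against a chance rate of 0.5 gives I = 0.93, that is, 93% fewer errors than chance assignment.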
bankruptcy prediction. Of course, the results of any study are bound by the limitations of the data and methodology.

Discriminant analysis classification rules often incorporate prior probabilities that account for both the assumed base rate and the costs associated with misclassification errors if different [16,27]. In our comparison study, prior probabilities were calculated from the base rates of the training sets. By using the base rates as the prior probabilities, the discriminant analysis procedure in this study actually incorporates significant unequal misclassification costs (i.e., misclassifying a bankrupt firm is a more costly error), since the true population of bankrupt firms is probably less than the training set base rate. Even so, neural networks continually predicted bankrupt firms more accurately using symmetric costs (testing threshold of 0.499). The major dilemma in utilizing the discriminant analysis model is in estimating the unequal misclassification costs. Future research investigating performance adjustments given explicit values for asymmetric misclassification costs for both discriminant analysis and neural networks may be warranted.

The investigation of the effects of different training and testing set composition on the predictive results leads to further implications for the decision maker and neural network researcher. Results indicated that the composition of the training set was a significant determinant of neural network predictive accuracy. Basically, it was shown that neural networks provide better understanding and differentiation between two concepts (bankrupt firms and non-bankrupt firms) when an equal number of examples of each concept is used in the learning procedure. This result is not dissimilar to one's intuition and previous results in discriminant analysis [23].

While all prediction errors are undesirable in a specific methodology, it is generally accepted that the incorrect prediction of a bankrupt firm as non-bankrupt is the most costly error. Results have indicated that prediction of the bankrupt firms poses the largest problem to the two different techniques. Neural networks were shown to perform well in predicting both bankrupt firms and non-bankrupt firms when presented with equal numbers of examples in the learning phase. Thus, a more accurate classification model will result when developed with an equal number of instances of each category. Since, in the real world, the decision maker may not have control over the composition of historical data necessary in the predictive model development, it appears that "smoothing" the distribution of the training set, irrespective of the actual distribution, will provide a better model.

It is true that neural network performance is less impressive as the proportion of non-bankrupt to bankrupt firms diverges. However, neural network models continue to outperform discriminant analysis. If one follows the recommendation of a 50-50 training set, neural network performance does not deteriorate significantly. Bankrupt firms are predicted correctly in the 92% to 97% range, with similar accuracy for non-bankrupt firms, given a balanced training set.

One caution to this approach in developing the training set is also indicated in the experimental results. Significance of testing set composition in bankrupt firm prediction may have indicated over-reported accuracy due to the small number of bankrupt firms in the 90-10 test sets. Thus, this study indicates that a potential trade-off exists when creating training and testing sets from the pool of existing problem data. A better predictive neural network model can be created by using a balanced training set; however, if too few of the hard-to-classify or more important cases exist in the cross-validation set, the model performance could be over- or under-reported. Either way, this will significantly affect the accuracy of decision maker confidence in the prediction model.

The results of predicting non-bankrupt cases improved as the imbalance of bankrupt to non-bankrupt firms increased in the training sets. This can be attributed to the significantly smaller number of bankrupt firms in the training sets. This phenomenon illustrates that, at the expense of "learning" about bankrupt firms, the network "memorizes" and becomes very good at recognizing (i.e., predicting) non-bankrupt firms. While overall predictive accuracy may remain high, the classification accuracy of bankrupt firms is seriously reduced. Thus, one would be significantly sacrificing the prediction performance of one important category to marginally increase the prediction performance on the other, easier predicted category. In firm bankruptcy predictions, this is obviously not desirable. Thus, great care must be taken when creating the training and
cross-validation sets when developing a neural network prediction model.

From a decision support systems perspective, this study has illustrated that neural networks are a viable model that should be included in the model base of a DSS. Predictive accuracy obtained in this study illustrates the potential of neural networks from a data reduction standpoint. With only five simple ratios, neural networks predicted at a high rate of classification accuracy; thus, these models may provide excellent results with fewer data requirements than other approaches to the problem.

Discriminant analysis is not the only tool that has been postulated for use in classification problems [13]. However, all other models do have limitations with regard to successful and appropriate use. In the case of discriminant analysis, limitations include the requirement that the variables should be jointly distributed according to a multivariate normal distribution, prior probability specification, and so forth. Neural networks have no such potential restrictive assumptions or requirements; they are more robust prediction techniques. Thus, neural networks offer additional benefits in reducing managerial concern over choosing the appropriate model in the decision support context.

Much additional research needs to be done regarding neural networks for bankruptcy prediction. The effects of network architecture, network training algorithms and learning paradigms need to be examined to provide more prescriptive results on implementing a neural network prediction model. As previously mentioned, this exploratory study uses only a small number of variables to achieve its high level of predictive accuracy; other variables should be included in the neural network model [2]. Notable omissions include the size of the firm, and time series data (more than just one year's previous financial data), among others. Additionally, using matched firms by industry and year has been postulated to bias results in predicting bankruptcy [40]. Neural networks may or may not be affected by this, but additional research should study this issue.

6. Conclusion

This paper has compared the predictive capability of neural networks with that of classical multivariate discriminant analysis within the context of forecasting firm bankruptcies on the basis of a small number of financial ratios. In this study, neural networks clearly outperformed discriminant analysis in prediction accuracy of both bankrupt and non-bankrupt firms under varying training and testing conditions. Additionally, it was shown that neural networks offer a significant improvement in prediction over pure chance, and that their use in prediction can reduce errors in this problem domain by as much as 93% over chance.

Neural networks, therefore, represent a classification technique that is a robust and promising approach in the prediction of firm stability. While this study is exploratory in nature and has some limitations as noted, it has shown the promise of neural networks through the use of a set of solid statistical analyses that should be utilized as research continues in this area.

Acknowledgments

The authors wish to sincerely thank Marcus Odom and Nik Dalai for their help and assistance in data collection and in their insightful comments on previous drafts of this paper. Also, the paper has greatly benefitted from comments and suggestions from the anonymous referees.

References

[1] Altman, E.I., Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy, The Journal of Finance, (September 1968), 589-609.
[2] Altman, E.I., Haldeman, R.G. and Narayanan, P., Zeta Analysis, Journal of Banking and Finance, (June 1977), 22-51.
[3] Altman, E.I., Accounting Implications of Failure Prediction Models, Journal of Accounting Auditing and Finance, (Fall 1982), 4-19.
[4] Baetge, J., Huss, M. and Niehaus, H., The Use of Statistical Analysis To Identify The Financial Strength Of Corporations In Germany, Studies in Banking and Finance, Vol. 7 (1988), 183-196.
[5] Beaver, W.H., Financial Ratios as Predictors of Failure, Empirical Research in Accounting: Selected Studies (1966), 71-111.
[6] Bell, T., Ribar, G. and Verchio, J., Neural Nets vs. Logistic Regression: A Comparison of Each Model's Ability to Predict Commercial Bank Failures, working paper, Peat Marwick Co., (May 1990).
[7] Blum, M., Failing Company Discriminant Analysis, Journal of Accounting Research, (Spring 1974), 1-25.
[8] Caudill, M., Neural Network Primer: Part III, AI Expert, (June 1988), 53-59.
[9] Caudill, M., Neural Network Training Tips and Techniques, AI Expert, (January 1991), 53-59.
[10] Collard, J.E., Commodity Trading with a Neural Net, Neural Network News, Vol. 2, No. 10 (October, 1990).
[11] Collins, E., Ghosh, S. and Scofield, C., An Application of a Multiple Neural Network Learning System to Emulation of Mortgage Underwriting Judgments, working paper, Nestor, Inc. (1989).
[12] Deakin, E.B., A Discriminant Analysis of Predictors of Business Failures, Journal of Accounting Research, (Spring 1972), 167-179.
[13] Denton, J., Hung, M. and Osyk, B., A Neural Network Approach to the Classification Problem, Expert Systems With Applications, Vol. 1, (1990), 417-424.
[14] Dutta, S. and Shekhar, S., Bond-Rating: A Non-Conservative Application of Neural Networks, Proceedings of the IEEE International Conference on Neural Networks, San Diego, (1988), 443-450.
[15] Eisenbeis, R. and Avery, R., Discriminant Analysis and Classification Procedures. (Lexington Books, Lexington MA, 1972).
[16] Eisenbeis, R., Pitfalls in the Application of Discriminant Analysis in Business, Finance and Accounting, Journal of Finance, (June 1977), 875-900.
[17] Farrington, D.P. and Tarling, R., Prediction in Criminology. (State University of New York Press, Albany, NY, 1985).
[18] Fishman, M., Barr, D. and Loick, W., Using Neural Networks in Market Analysis, Technical Analysis of Stocks and Commodities, (April 1991), 18-25.
[19] Francett, B., Neural Nets Arrive, Computer Decisions, (Jan. 1989), 58-62.
[20] Gallant, S.I., Connectionist Expert Systems, Communications of the ACM, (February 1988), 152-169.
[21] Goodman, R.M., Miller, J.W. and Smyth, P., An Information Theoretic Approach to Rule-Based Connectionist Systems, in Advances in Neural Information Processing Systems I, D.S. Touretsky ed., (Kaufman Publishing: San Mateo, CA, 1989), 356-364.
[22] Huberty, C.J., Issues in the Use and Interpretation of Discriminant Analysis, Psychological Bulletin, Vol. 95 (1984), 156-171.
[23] Jain, A. and Chandrasekaran, B., Dimensionality and Sample Size Considerations in Pattern Recognition Practice, in Handbook of Statistics, Vol. 2, P. Krishnaiah and Kanal, L. eds. (North-Holland, 1982), 835-855.
[24] Kamijo, K. and Tanigawa, T., Stock Price Pattern Recognition: A Recurrent Neural Network Approach, International Joint Conference on Neural Networks, San Diego, (June 1990).
[25] Karels, G.V. and Prakash, A., Multivariate Normality and Forecasting of Business Bankruptcy, Journal of Business Finance and Accounting, (Winter 1987), 573-593.
[26] Klecka, W.R., Discriminant Analysis. (Sage Publishing: Beverly Hills, CA, 1980).
[27] Lachenbruch, P., Discriminant Analysis. (Hafner Press: NY, NY, 1975).
[28] Meehl, P.E., Clinical versus Statistical Prediction: A Theoretical Analysis and a Review of the Evidence. (University of Minnesota Press: Minneapolis, 1954).
[29] Meehl, P.E. and Rosen, A., Antecedent Probability and the Efficiency of Psychometric Signs, Patterns or Cutting Scores, Psychological Bulletin, 52, (1955), 194-216.
[30] Mighell, D., Back-Propagation and its Application to Handwritten Signature Verification, in Advances in Neural Information Processing Systems I, D.S. Touretsky ed. (Kaufman Publishing: San Mateo, CA, 1989), 340-347.
[31] Morrison, D.G., On the Interpretation of Discriminant Analysis, Journal of Marketing Research, Vol. 6, (1969), 156-163.
[32] Moyer, R.C., Forecasting Financial Failure: A Reexamination, Financial Management, (Spring 1977), 11-17.
[33] Rochester, J. (ed.) New Business Uses For Neurocomputing, I/S Analyzer, (Feb 1990), 1-17.
[34] Siegel, J.G., Warning Signs of Impending Business Failure and Means to Counteract such Prospective Failure, The National Public Accountant, (April 1981), 9-13.
[35] Stanley, J. and Bak, E., Introduction to Neural Networks. (California Scientific Software, Sierra Madre, CA, 1989).
[36] Surkan, A. and Singleton, J., Neural Networks For Bond Rating Improved by Multiple Hidden Layers, International Joint Conference on Neural Networks, San Diego, (June 1990).
[37] Teebagy, N. and Chatterjee, S., Inference in a Binary Response Model with Applications to Data Analysis, Decision Sciences, Vol. 20, No. 2, (1989), 393-403.
[38] Watts, R.L. and Zimmerman, J.L., Positive Accounting Theory. (Prentice-Hall, 1986).
[39] Wilkinson, L., SYSTAT: The System for Statistics. (SYSTAT, Inc., Evanston, IL, 1989).
[40] Zmijewski, M.E., Methodological Issues Related to the Estimation of Financial Distress Prediction Models, Journal of Accounting Research, Vol. 22 Supplement (1984), 59-82.