P. K. Viswanathan¹
S. K. Shanthi¹
Abstract
Credit score models have been successfully applied in the traditional
credit card industry and by mortgage firms to distinguish defaulting
customers from non-defaulting customers. In the light of growing
competition in the microfinance industry, over-indebtedness and other
factors, the industry has come under increased regulatory supervision.
Our study provides evidence from a large microfinance institution (MFI)
in India, to which we have applied both the credit scoring method and the
neural network (NN) method and compared the results. In this article, we
demonstrate the capability of credit scoring models for an Indian-based
microfinance firm in terms of predicting default probability as well as
the relative importance of each of its associated drivers. A logistic
regression model and an NN have been used as the predictive analytic
tools for sifting the key drivers of default.
Keywords
Logistic regression, probability of default, MFI, neural network
¹ Great Lakes Institute of Management, Kanchipuram, Tamil Nadu, India.
Corresponding author:
P. K. Viswanathan, Great Lakes Institute of Management, Kanchipuram, Tamil Nadu, India.
E-mail: viswanathan.pk@greatlakes.edu.in
2 Journal of Emerging Market Finance 16(3)
1. Introduction
As a sequel to the microfinance crisis which took place in Andhra
Pradesh in India in October 2010, the microfinance institutions (MFIs)
in India have come under stringent regulations. Among other things, the
regulators have sought to put a cap on the interest margins that MFIs can
receive from their operations. Given this fact, assessment of credit risk in
any MFI assumes great relevance. The Indian microfinance industry is
also in a critical growth phase in which MFIs are attempting to make the
transition from non-profit entities to profit-making enterprises
capable of economic viability over the years to come. Increasing
awareness among investors and lenders of commercial viability, growing
competition and shrinking returns in the microfinance market are the key
factors forcing MFIs to improve the efficiency of their lending
operations. It is in this context that predicting credit default looms
large. Predictive analytic models are being used to estimate the
probability of default as well as to differentiate defaulters from
non-defaulters in terms of important characteristics/variables. To our
knowledge, no study of risk measurement in the microfinance sector in
India has used credit scoring methodology.
Credit scoring models involve quantitative analysis of past loan data to
predict the probability of future default, assuming that the
relationships observed in the past continue to hold. The models in
this regard do two things. First, they predict the probability of default,
and second, they provide a classification table giving the defaulters and
non-defaulters predicted by the model alongside the actual defaulters and
non-defaulters, thus enabling us to evaluate the predictive power of the
model used. Scoring models primarily rely on the enormous computing power
available today to estimate the probability of default and to classify
borrowers using advanced statistical techniques.
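As a minimal sketch of the second of these two outputs, the snippet below
turns predicted probabilities into a classification table. The function
name, the 0.5 cut-off and the toy data are illustrative assumptions and
are not taken from the study.

```python
from collections import Counter


def classification_table(actual, predicted_prob, cutoff=0.5):
    """Cross-tabulate observed outcomes against model predictions.

    actual         -- list of 0/1 flags (1 = the loan defaulted)
    predicted_prob -- list of model probabilities of default, each in [0, 1]
    cutoff         -- probability at or above which a loan is classified as a
                      defaulter (0.5 is an illustrative assumption)
    """
    counts = Counter()
    for observed, prob in zip(actual, predicted_prob):
        predicted = 1 if prob >= cutoff else 0
        counts[(observed, predicted)] += 1
    correct = counts[(0, 0)] + counts[(1, 1)]
    return {
        "observed_no_predicted_no": counts[(0, 0)],
        "observed_no_predicted_yes": counts[(0, 1)],
        "observed_yes_predicted_no": counts[(1, 0)],
        "observed_yes_predicted_yes": counts[(1, 1)],
        "overall_accuracy": correct / len(actual),
    }


# Made-up toy data, not drawn from the study.
toy_actual = [0, 0, 0, 1, 1]
toy_prob = [0.10, 0.40, 0.70, 0.80, 0.30]
table = classification_table(toy_actual, toy_prob)
print(table)
```

Lowering the cut-off would classify more borrowers as defaulters, trading
missed defaulters for more false alarms.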
3. Data
We have used primary data collected from a leading MFI, ABC, in Tamil
Nadu. The company has been in the business of extending microcredit to
people who are unable to get finance from the mainstream banking
avenues. In this context, the alternative source of finance is private
money lenders, whose rates of interest are in the vicinity of 30–100
per cent. The mission of ABC is to make finance available at reasonable
cost to such customers in a transparent manner and, in the process, to
achieve acceptable returns on investment to ensure economic viability.
ABC is one of the largest microfinance companies in India. As of July
2014, the company had a client/beneficiary base of close to two million,
employee strength of about 3400 and total outstanding credit of about
₹1.73 billion. The company currently has about 335 branches spread
over India.
by their teams and the sales officer communicates the salient features of
the company’s schemes to prospective borrowers. The applicants
are then screened for their credit risk. Criteria used include length of stay
in the same place of residence, nature of business, income, expenditure,
age, caste, among others.
4. Methodology
Logistic regression is a variation of ordinary regression in which the
dependent variable is binary, taking the value 0 or 1. The dependent
variable is categorical and usually represents the occurrence or
non-occurrence of an event, and the independent variables can be
continuous, categorical or both. Logistic regression has been widely used
in the financial services industry for credit scoring models. On
theoretical grounds, logistic regression is a more appropriate statistical
tool than linear regression because the dependent variable in credit risk
is categorical with two discrete classes, namely, a customer is either a
defaulter or a non-defaulter. Ordinary least squares (OLS) regression is
fraught with problems in predicting the probability of default, which has
to lie between 0 and 1: it cannot guarantee that the estimated probability
will always fall in the range 0–1. In contrast, logistic regression
ensures that the estimated probability falls in the range 0–1 because it
is based on a
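The range argument made in this section can be illustrated with a small
sketch. It assumes the standard logistic (sigmoid) transform of a linear
score; the intercept, slope and input values are hypothetical and are not
estimates from the study.

```python
import math


def linear_prediction(intercept, slope, x):
    """Straight-line (OLS-style) prediction; not confined to the range 0-1."""
    return intercept + slope * x


def logistic_prediction(intercept, slope, x):
    """Logistic transform of the same linear score; stays within 0-1."""
    z = intercept + slope * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)  # this branch avoids overflow for large negative z
    return exp_z / (1.0 + exp_z)


# Hypothetical coefficients and inputs, chosen only to show the contrast.
INTERCEPT, SLOPE = 0.2, 0.1
xs = [-10, 0, 10]
linear_values = [linear_prediction(INTERCEPT, SLOPE, x) for x in xs]
logistic_values = [logistic_prediction(INTERCEPT, SLOPE, x) for x in xs]

for x, lin, logit in zip(xs, linear_values, logistic_values):
    print(f"x={x:>4}  linear={lin:6.2f}  logistic={logit:6.3f}")
```

For these inputs the straight-line prediction falls below 0 at one end and
above 1 at the other, while the logistic prediction stays inside the
interval.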
For the purposes of this study, the following 10 variables have been
identified in the context of predicting credit risk, based on a detailed
discussion with the organisation. These are the variables that
microfinance organisations use in trying to understand the credit risks
involved. The modelling techniques applied here, however, are different:
they have better scientific underpinnings and are expected to perform
better. ABC Ltd. does not use either NNs or logistic regression in its
modelling process. Therefore, the purpose of this work is also to provide
it with a modelling technique that performs better in predicting credit
risk.
1. Age
2. Total family members
3. Length of stay—duration of stay in the house
4. Loan amount requested—loan principal amount
5. Total income of family
6. Monthly expenses
7. Toilet—attached or public toilet
8. Type of house—tiled or RCC—concrete or sheet or thatched
9. Religion
10. Caste
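Before variables like these can enter a logistic regression or an NN, the
categorical ones have to be coded numerically. The sketch below shows one
common approach, dummy (indicator) coding with a reference category. The
field names, the reading of the house categories as four levels and the
applicant record are assumptions made for illustration; religion and caste
are omitted because their categories are not listed in the text.

```python
# Category labels follow the variable list; the first entry of each list is
# treated as the reference category (an arbitrary, illustrative choice).
TOILET_TYPES = ["attached", "public"]
HOUSE_TYPES = ["tiled", "rcc", "sheet", "thatched"]

# Hypothetical field names for the six numeric variables.
NUMERIC_FIELDS = [
    "age",
    "total_family_members",
    "length_of_stay",
    "loan_amount",
    "total_income",
    "monthly_expenses",
]


def dummy_code(value, categories):
    """Return 0/1 indicators for every category except the reference one."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories[1:]]


def build_feature_row(applicant):
    """Flatten one applicant record into a numeric feature vector."""
    row = [applicant[field] for field in NUMERIC_FIELDS]
    row += dummy_code(applicant["toilet"], TOILET_TYPES)
    row += dummy_code(applicant["house_type"], HOUSE_TYPES)
    return row


# Hypothetical applicant, not taken from the study's data.
applicant = {
    "age": 35,
    "total_family_members": 4,
    "length_of_stay": 10,
    "loan_amount": 20000,
    "total_income": 12000,
    "monthly_expenses": 8000,
    "toilet": "public",
    "house_type": "sheet",
}
feature_row = build_feature_row(applicant)
print(feature_row)
```

Dropping one category per variable avoids perfect collinearity with the
model's intercept, which is why a four-level variable yields three
indicator columns.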
                           Predicted Overdue
Observed                   No       Yes      Percentage Correct
Overdue       No           479      25       95.0
              Yes          46       90       66.2
Overall Percentage                           88.9
Source: SPSS Output.
The model performs well in terms of its overall predictive power. Out of
the 504 actual cases which belong to ‘no overdue’, the model has
incorrectly predicted 25 as ‘yes overdue’, which is only 5 per cent of the
‘no overdue’ cases. This is a measure of type I error. Out of the 136
cases observed in the actual data which are in the overdue category, the
model has incorrectly predicted 46 as ‘no overdue’, which amounts to 33.8
per cent. This is a measure of type II error. The type II error is large,
and thus, we observe that the model has not been able to strike a proper
balance between type I and type II errors though the overall predictive
accuracy is satisfactory (88.9 per cent).
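The percentages quoted above can be reproduced from the four cell counts
of the logistic regression classification table. The sketch below does the
arithmetic; the function and variable names are illustrative only.

```python
def error_rates(no_as_no, no_as_yes, yes_as_no, yes_as_yes):
    """Return type I error, type II error and overall accuracy in per cent.

    Arguments are the four cells of a classification table, named as
    'observed class'_as_'predicted class'.
    """
    observed_no = no_as_no + no_as_yes
    observed_yes = yes_as_no + yes_as_yes
    total = observed_no + observed_yes
    return {
        "type_1_pct": 100.0 * no_as_yes / observed_no,    # non-defaulters flagged
        "type_2_pct": 100.0 * yes_as_no / observed_yes,   # defaulters missed
        "accuracy_pct": 100.0 * (no_as_no + yes_as_yes) / total,
    }


# Cell counts from the logistic regression classification table.
logistic_rates = error_rates(no_as_no=479, no_as_yes=25, yes_as_no=46, yes_as_yes=90)
for name, value in logistic_rates.items():
    print(f"{name}: {value:.1f}")
```

The type I rate is taken relative to the 504 observed ‘no overdue’ cases
and the type II rate relative to the 136 observed overdue cases, not
relative to the full sample of 640.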
From Table 2, the following insights could be drawn. Length of stay,
total income, loan amount required and expenses are overwhelmingly
significant predictors of loan default at the 5 per cent level (see
Table 2, where p-values are given under the column Sig).
1. Type of house and age are highly significant at the 5 per cent level,
indicating that they are good predictors of loan default.
2. Caste is an overwhelmingly significant predictor of default
(the p-value is very small) at the 5 per cent level.
3. Total family (number of members) is moderately significant
(significant at 6.4 per cent) as a predictor of default.
4. The EXP (B) column in the output in Table 2 has an interesting
interpretation. These values are odds ratios: whenever the number is
more than 1, the odds of default increase with every additional
unit of the independent variable concerned. By this criterion, we
find that length of stay, total income, total family, expenses,
type of house and caste are critical in assessing default
behaviour.
5. Loan amount required has an odds ratio (0.999) very close to 1 but
is statistically significant, and hence can be taken to be a
critical predictor of risk.
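Exp(B) values are odds ratios per one-unit change in a variable, so their
practical size depends on the unit of measurement. The sketch below works
through the arithmetic for the reported value of 0.999 and for a
hypothetical odds ratio above 1. The baseline probabilities, the 1,000-unit
change and the value 1.5 are assumptions chosen only for illustration.

```python
def updated_probability(base_prob, odds_ratio_per_unit, units):
    """Apply an odds ratio over a given number of units to a baseline probability."""
    base_odds = base_prob / (1.0 - base_prob)
    new_odds = base_odds * odds_ratio_per_unit ** units
    return new_odds / (1.0 + new_odds)


# Reported Exp(B) for loan amount required; the 1,000-unit change is hypothetical.
ratio_over_1000_units = 0.999 ** 1000
print(f"odds ratio over 1,000 units: {ratio_over_1000_units:.3f}")

# Hypothetical baseline default probability of 20 per cent.
prob_after_1000_units = updated_probability(0.20, 0.999, 1000)
print(f"probability after 1,000 units: {prob_after_1000_units:.3f}")

# Hypothetical odds ratio of 1.5 applied to a 5 per cent baseline: the odds
# rise by half, yet the probability remains far below 50 per cent.
prob_with_ratio_above_one = updated_probability(0.05, 1.5, 1)
print(f"probability with odds ratio 1.5: {prob_with_ratio_above_one:.3f}")
```

An odds ratio that looks negligible per unit can therefore compound into a
large effect over a realistic change in the variable, and an odds ratio
above 1 raises the odds without implying that default is more likely than
not.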
Table 2. Logistic Regression—Relative Importance of Variables
Logistic Regression- Variables in the Equation
                               Predicted
Sample      Observed           No       Yes      Per cent Correct
Training    No                 325      11       96.7
            Yes                21       75       78.1
            Overall Per cent                     92.6
Testing     No                 161      7        95.8
            Yes                6        34       85.0
            Overall Per cent                     93.8
Source: SPSS Output.
Note: Dependent Variable: overdue.
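To set the NN results beside the logistic regression results, the sketch
below computes, for each classification table reported in this article, the
overall accuracy and the share of observed overdue cases that the model
missed. The dictionary layout and names are illustrative assumptions; the
cell counts are those reported in the tables.

```python
# Each entry: (observed no -> predicted no, observed no -> predicted yes,
#              observed yes -> predicted no, observed yes -> predicted yes)
TABLES = {
    "logistic regression": (479, 25, 46, 90),
    "NN training": (325, 11, 21, 75),
    "NN testing": (161, 7, 6, 34),
}


def summarise(cells):
    """Return (overall accuracy, share of overdue cases missed), both in per cent."""
    no_no, no_yes, yes_no, yes_yes = cells
    total = no_no + no_yes + yes_no + yes_yes
    accuracy = 100.0 * (no_no + yes_yes) / total
    missed = 100.0 * yes_no / (yes_no + yes_yes)
    return accuracy, missed


summary = {name: summarise(cells) for name, cells in TABLES.items()}
for name, (accuracy, missed) in summary.items():
    print(f"{name:<20} accuracy={accuracy:5.1f}  overdue missed={missed:5.1f}")
```

On these counts the NN misses a smaller share of overdue cases than the
logistic regression model in both the training and the testing samples.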
Viswanathan and Shanthi 11
length of stay, total income, loan amount required, expenses, age, type of
house, total family, caste and toilet type.
7. Conclusion
In this research article, we have demonstrated the capability of credit
scoring models for an Indian-based microfinance firm in terms of
predicting default probability. Further, we have been able to sift the
relative importance of each of its associated drivers. The strengths and
limitations of logistic regression and NN have been discussed in the
context of the predictive power of credit risk modelling. In terms of
predictive power, NN outperforms logistic regression. The predictive
accuracy of NN is in the vicinity of 93 per cent for the training sample
and 94 per cent for the testing sample, which is higher than that of
logistic regression (88.9 per cent). However, because the shortcomings of
NN with regard to explanatory variables and the precise form of the
equation cannot be fully redressed, we confine ourselves to the logistic
regression model to predict loan default. We have synergised the
advantages of both these techniques to confirm the significant
independent variables that impact the behaviour of credit