
Kalicut Bank Ltd.: Application of Financial Data Analytics to Increase Customer Profitability

Abhilash S. Nair and Sony Thomas

Ms. Shirin is an IIM Kozhikode alumna of the class of 2000. She received a gold medal for her
academic performance and was recruited by Kalicut Bank Ltd., a bank headquartered in Calicut.
During the last decade, with the advent of new private sector banks as well as global bankers, the
service orientation as well as the offerings of banks have undergone a sea change. The increased
competition has resulted in bankers increasing their dependence on fee-based income.

Retail Banking sector in India


The commercial banking segment in India comprises three participants: (i) Public Sector
Banks, (ii) Private Sector Banks and (iii) Foreign Banks. Privatisation in the sector was initiated
in 1993. Prior to that, public sector banks accounted for more than 90 percent of total assets in the
banking system. Post privatisation, by 2004 this share had fallen to 75 percent, with the remaining 25
percent being serviced by private sector banks. One widely used measure of concentration in banking
is the k-bank concentration ratio: the ratio of the assets held by the k largest banks to the
total assets in the banking system. A popular choice is the three-bank concentration ratio
(Bikker, 2004; Barth et al., 2004; Claessens and Laeven, 2004). However, this measure is based
on an arbitrary cut-off point and does not require the sample to be uniform in size. Hence, a better
measure is to use the 5th percentile of the largest banks.
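
As a rough illustration of the k-bank concentration ratio described above, the following Python sketch computes the share of total assets held by the k largest banks. The function name and the asset figures are hypothetical and are not taken from the case or from the cited studies.

```python
def concentration_ratio(assets, k):
    """Share of total banking assets held by the k largest banks."""
    if k <= 0 or k > len(assets):
        raise ValueError("k must be between 1 and the number of banks")
    total = sum(assets)
    if total <= 0:
        raise ValueError("total assets must be positive")
    # Sort in descending order and keep the k largest balance sheets.
    top_k = sorted(assets, reverse=True)[:k]
    return sum(top_k) / total


# Hypothetical asset figures (arbitrary units), for illustration only.
bank_assets = [400, 250, 150, 100, 60, 40]
cr3 = concentration_ratio(bank_assets, 3)  # three-bank ratio
cr5 = concentration_ratio(bank_assets, 5)  # five-bank ratio
```

A falling ratio over time, as reported for Indian banks in the next paragraph, would indicate that the largest banks hold a shrinking share of the system's assets.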

Based on this five bank measure, the asset concentration in Indian banks seems to have declined
from 0.51 in 1991-92 to 0.41 in 2003-04, and the concentration in loans disbursed declined
from 0.68 in 1991-92 to 0.41 in 2003-04. Thus the entry of new banks in the private sector
has strengthened competition (Ghosh and Prasad, 2005). The increased competition puts
pressure on the spreads enjoyed by bankers, and hence the need of the hour is twofold:
(i) increase credit at competitive terms and (ii) guard against non-performing assets. While
on the one hand lending had to be liberalized, one also had to be careful as to which customers
deserve the liberalized lending policy. One way out was to understand each customer and
tailor-make products for each of them.
Meanwhile at Kalicut Bank
The top management of Kalicut Bank, having realized that they had to change with the
environment, decided to put together a team to customize offerings for each set of clients. To
begin with, they decided to do so in the Malabar region and thereafter, depending upon
the impact of this move on the bottom line, to decide whether and how to replicate this model
for the rest of the bank's branches in India. Having recognized this move to be a determining factor in
their becoming a successful scheduled commercial bank, they decided to give charge of this
project to Shirin. On hearing this, Shirin was extremely tense. She realized that during the last
decade in banking she had become too mechanical in her thinking and had almost lost the
art of abstract thinking. She recollected that she had studied something on customer value in her
classes on Asset Valuation in the fifth term of her PGP. She went back and
looked at her notes and found that the loss of memory property for efficient markets is applicable
in real life too…

Consequently, she decided to talk to Prof. S.G., a Chair Professor in Banking at IIMK, who
spent about three decades in banking and has now been in academics for more than a decade. He has
also been a consultant to Kalicut Bank and recollects that about two years back he had
recommended that the bank's management tailor-make schemes. For this purpose, he had analysed
the factors that determine the probability that a customer would respond to a particular stimulus
(scheme), the probability that a customer may buy insurance, the probability that a customer may
pre pay his liability, the probability that a customer may apply for refinance (before the previous loan
tenure is over) and, most importantly, the probability of default for each customer. He had worked
this out by analysing the entire client list of the bank across different branches all over India. On
being told about the project by Shirin, he felt satisfied that, though two years had passed since he
had given the report, the bank had at least now realized that it needs to change.

Prof. S.G.’s Model

The report prepared by Prof. S.G. analysed five distinct models to arrive at the probability of
customer response (model 1), probability of insurance sale (model 2), probability of pre payment
(model 3), probability of refinance (model 4) and probability of default (model 5). The results of
his study are given in Appendix 2.a. to 2.e. Since the outcome of each of the models was a
probability, a logistic regression based model was specified as given below:
ln[p/(1-p)] = α + βX + e

where p is the probability that event y occurs [p(y=1)] and X represents a matrix of the
explanatory variables. In terms of the problem at hand, the probabilities so arrived at indicate
the chances that (i) the customer responds favourably to an offer made by the bank, (ii) he/she
applies for refinance, and so on. Only the variables that are significant are reported in the
results.

Work on hand for Shirin


For the Malabar region, the existing lending rate for customers of different credit worthiness, as
assessed by CIBIL, is given in Appendix 1.a. Panel 1 of Appendix 1.a. contains details
pertaining to borrowers who do not pre pay their loans, Panel 2 and Panel 3 pertain to
borrowers who might apply for refinancing, and Panel 4 pertains to clients who might
pre pay the loan. Each of the five models and its variable specification is given below:

Model 1: Odds that a customer shall respond to a certain stimulus that the bank may provide.
ln[p/(1-p)]= α + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8 + β9X9 + β10X10 + β11X11 + β12X12 + β13X13 + β14X14

Where,
(i) X1, X2, X3 and X4 depict customers whose current borrowing rates are between 14.24 &
23.8%; 23.8 & 25.95%; 25.95 & 26.99% and 29.2 & 33.65%.
(ii) X5, X6, X7 and X8 depict customers whose current loan amounts are between Rs. 5,523 &
Rs. 8,023; Rs. 8,023 & Rs. 10,523; Rs. 10,523 & Rs. 15,523 and Rs. 15,523 & Rs. 20,523.
(iii) X9 and X10 depict customers whose Net Monthly Income (NMI) of Applicant 1 is less than Rs.
2,419 and whose NMI is greater than Rs. 5,170, respectively.
(iv) X11 depicts whether the loan is secured or not.
(v) X12 denotes the credit quality of the customer as given by CIBIL (Credit Information Bureau
of India Ltd.). The nomenclature for credit worthiness used by CIBIL is given in Appendix 1.a.
(vi) X13 and X14 denote the profession of the customer.

All the variables, discussed above, are categorical and hence a logit model (Hosmer and
Lemeshow, 2000; Kleinbaum and Klein, 2005) was found best suited to model the odds that the
said event may occur (For a review of Logit, refer Appendix 4).
Model 2: Odds that a customer shall take insurance on his loan
ln[p/(1-p)]= α + β1X1 + β2X2 + β3X3 + β4X4+ β5X5 + β6X6 + β7X7 + β8X8+ β9X9 + β10X10 + β11X11 + β12X12 +
β13X13 + β14X14 + β15X15

Where,
(i) X1, X2, X3 and X4 depict NMI of applicant 1 between 0 & Rs. 1,660; Rs. 1,660 & Rs. 2,299;
Rs. 2,299 & Rs. 2,787 and greater than Rs. 3,437.
(ii) X5, X6, X7, X8 and X9 depict customers who have currently borrowed between 0 & Rs. 3,368;
Rs. 6,368 & Rs. 9,515; Rs. 10,523 & Rs. 14,101; Rs. 14,101 & Rs. 20,523 and greater than Rs. 20,523.
(iii) X10 describes customers whose borrowing rate is between 19.95% and 26.5%.
(iv) X11 indicates whether the customer has low credit quality (“FS” rating) or not.
(v) X12 denotes whether customers have dependents or not.
(vi) X13 denotes whether the tenure for which the current loan has been taken is between 6 and 24
months or not.
(vii) X14 denotes whether the co applicant is an earning member or not.
(viii) X15 denotes the value of the real estate owned by the borrower.

Model 3: Odds that a customer shall pre pay her loan (It has been observed that customers who
refinance their loans are the ones who pre pay. The proportion of the tenure for which clients in
each refinance segment keep the loan is given in Panel 4 of Appendix 1.a.)
ln[p/(1-p)]= α + β1X1 + β2X2 + β3X3 + β4X4+ β5X5 + β6X6 + β7X7

Where,
(i) X1 indicates whether the current rate at which the customer is borrowing is greater than 19.5%
or not.
(ii) X2 and X3 indicate customers with high credit quality (“AA” and “A” rating) and low credit
quality (“E” and “FS” rating) respectively.
(iii) X4 and X5 indicate if the customer has currently borrowed between Rs. 5,523 & 10,368 and
between Rs. 10,368 & 17,523 respectively.
(iv) X6 indicates the actual tenure of their existing loan.
(v) X7 indicates the real estate value of the property currently held by the customer.
Model 4: Odds that a customer may apply for refinance (The refinancing rate depends on the prime lending
rate for each customer segment. The adjustment to the prime lending rate to arrive at the rate at which refinance
loans are made is given in Panel 2 of Appendix 1.a. Similarly, the amount of refinance available depends on the
initial borrowing. The adjustment to initial borrowing to accommodate the refinanced borrowing is given in
Panel 3 of Appendix 1.a.)

ln[p/(1-p)]= α + β1X1 + β2X2 + β3X3 + β4X4+ β5X5 + β6X6 + β7X7 + β8X8+ β9X9

Where,
(i) X1 depicts whether the loan is secured or not.
(ii) X2 and X3 indicate customers with existing loans with tenure less than 48 months and
greater than 60 months respectively.
(iii) X4 denotes the current refinance borrowing rate. As stated above the adjustment
needed to prime lending rate to arrive at the said rate for different refinance customer
segments is given in Panel 2 of Appendix 1.a.
(iv) X5 indicates the real estate value of the property currently held by the customer.
(v) X6 indicates whether the net monthly income of the first applicant is less than
or equal to Rs. 1,877 or not.
(vi) X7 indicates whether the refinance loan amount is less than or equal to Rs. 3,129 or
not (Panel 3: Appendix 1.a.).
(vii) X8 and X9 indicate customers with low (“D” or “E” rating) and lowest (“F” or “FS” rating)
credit quality respectively.

Model 5: Odds that a customer may go bankrupt


ln[p/(1-p)]= α + β1X1 + β2X2 + β3X3 + β4X4+ β5X5 + β6X6 + β7X7 + β8X8

Where,
(i) X1 and X2 indicate customers whose current borrowing rate is less than or equal to 13.95% and
greater than 32.95% respectively
(ii) X3 indicates customers with low credit quality (“E” or “F” or “FS” rating).
(iii) X4 indicates customers who have currently borrowed between Rs. 8,523 and Rs. 15,523.
(iv) X5 indicates customers with loan tenure between 42 and 60 months.
(v) X6 and X7 indicate customers whose NMI of applicant 1 is between Rs. 3,250 & Rs. 3,832 and
greater than Rs. 3,832 respectively.
(vi) X8 indicates the real estate value of the property currently held by the customer.
Sample data pertaining to each of the variables described above are given in the tables in Appendix 2.
The results of the study are reported in Appendix 3.

Citing the results of the above models Prof. S.G. continued:

Prof. S.G.: You can use the coefficients as reported in my paper to arrive at an aggregate score
indicating the log odds of responding favourably, buying insurance, pre paying, applying for refinance
and defaulting.

Shirin: But what shall I do with them? It’s just an odds ratio.

Prof. S.G.: One can calculate the related probability measure based on the log odds ratio. The
probability of an event (customer response, insurance sale, pre payment, refinance and default)
occurring can be calculated as: p = 1/[1 + exp(-(α + β1X1 + β2X2 + β3X3 + β4X4 + … + βnXn))]. So now
you have the probability of occurrence as well as the outcome itself. So one can calculate the expected
value of the outcome (expected interest income, pre payment fees, insurance premium, expected
cost of funds and expected default, net of recoveries).

Shirin looked completely lost. She said, “Well sir, I feel that now that we have the probabilities we shall do
a cost-benefit analysis, i.e., multiply the different sources of income and expenditure by the respective
probabilities.”

Prof. S.G.: But that’s assuming that they shall all respond favourably to the offer made to them.

Shirin: What offer?

Prof. S.G.: Well that’s what we started with, you would try to make the customer an offer that he
would find attractive and would also maximize your profits.

Shirin: A major source of income for this bank would be the interest income on advances that last the
full term as well as those that are pre paid. But isn’t that what we are trying to find out, that is, a
competitive interest rate that will make the loan attractive to the customer as well as maximize the
bank’s profits?

Prof. S.G.: Yes you are right but you shall need some basic data to start with and hence you would use
the customer’s current borrowing rate and loan amount to find the interest income he generates for the
bank.

Shirin: But what if the advance is pre paid?

Prof. S.G.: From Historic data you can classify customers into different loan bands and credit bands.
Subsequently you can find the proportion of total tenure for which they hold the loan (Panel 4
Annexure 1.a).

Shirin: OK once I know the time at which each customer shall repay, I can find the cumulative interest
on that loan.

Prof. S.G.: Don’t forget to use the results of model 3.
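
The expected interest income discussed here can be sketched in Python as a probability-weighted average of the full-term and pre-paid cases. This is only an illustration: the case does not state how interest accrues, so simple interest on the original amount is assumed, and the function names and example inputs are hypothetical. In practice `p_prepay` would come from model 3 and `held_fraction` from Panel 4 of Appendix 1.a.

```python
def simple_interest(amount, annual_rate_pct, months):
    """Simple interest on `amount` over `months` (assumption: no amortisation)."""
    return amount * (annual_rate_pct / 100.0) * (months / 12.0)


def expected_interest_income(amount, annual_rate_pct, tenure_months,
                             p_prepay, held_fraction):
    """Weight full-term and pre-paid interest by the pre payment probability."""
    full_term = simple_interest(amount, annual_rate_pct, tenure_months)
    # A pre paying customer holds the loan for only a fraction of the tenure.
    prepaid = simple_interest(amount, annual_rate_pct,
                              tenure_months * held_fraction)
    return (1.0 - p_prepay) * full_term + p_prepay * prepaid


# Hypothetical inputs: Rs. 10,000 at 12% for 60 months,
# 10% chance of pre payment after one fifth of the tenure.
income = expected_interest_income(10000, 12.0, 60, 0.10, 0.20)
```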

Shirin: Yes, I shall find the expected interest income for both full term loans and pre paid loans. The
other source of income for the bank shall be insurance income, which is arrived at as
Total loan amount × Insurance Penetration (20%) × Insurance Premium (Panel 4: Annexure 1.a) × Sales
commission (10%). Subsequently I shall find the expected insurance income.
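
A minimal Python sketch of this insurance income calculation follows. The premium schedule is the one listed by tenure in Appendix 1.a; the function names are hypothetical, and weighting the result by the model 2 probability to obtain the "expected" income is one interpretation of the dialogue rather than a stated rule.

```python
INSURANCE_PENETRATION = 0.20  # stated in the case
SALES_COMMISSION = 0.10       # stated in the case


def insurance_premium_pct(tenure_months):
    """Premium (%) by loan tenure, as tabulated in Appendix 1.a."""
    if tenure_months <= 36:
        return 8.0
    if tenure_months <= 54:
        return 12.0
    if tenure_months <= 80:
        return 14.0
    return 16.0


def insurance_income(loan_amount, tenure_months):
    """Loan amount x penetration x premium x sales commission."""
    premium = insurance_premium_pct(tenure_months) / 100.0
    return loan_amount * INSURANCE_PENETRATION * premium * SALES_COMMISSION


def expected_insurance_income(loan_amount, tenure_months, p_insurance):
    """Assumption: weight the income by the model 2 probability of purchase."""
    return p_insurance * insurance_income(loan_amount, tenure_months)


# Example using the amount and tenure of customer 1 in Appendix 2.
income_c1 = insurance_income(15855, 60)
```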

Prof. S.G.: Doesn’t your bank have any processing fee or early termination penalty?

Shirin: Oh yes sir, if the loan amount is more than Rs. 4,000 we charge Rs. 250 as processing fees;
otherwise we don’t. Also, we charge an early termination penalty depending on whether the customer has
stayed levered for at least one fourth of the total tenure or less. If the customer holds the loan for a
period less than one fourth of the total tenure we levy a Rs. 150 pre payment penalty, or else Rs. 75.
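
The fee rules Shirin describes can be written as two small Python functions. The function names are hypothetical; the thresholds and amounts are the ones quoted in the dialogue.

```python
def processing_fee(loan_amount):
    """Rs. 250 if the loan amount exceeds Rs. 4,000, otherwise nothing."""
    return 250 if loan_amount > 4000 else 0


def prepayment_penalty(months_held, tenure_months):
    """Rs. 150 if the loan is held for less than a quarter of its tenure, else Rs. 75.

    Applies only to customers who actually pre pay.
    """
    return 150 if months_held < tenure_months / 4 else 75


# A 60-month loan closed after 10 months attracts the higher penalty.
penalty_early = prepayment_penalty(10, 60)
penalty_late = prepayment_penalty(15, 60)
```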

Prof. S.G.: What about Refinance?

Shirin: Sir, how can I compute the income from refinance for each customer? Only a few of them shall
be going for it, and even then how can I forecast the tenure, amount and rate?

Prof. S.G.: Based on historic data you could tabulate the amount and rate for each customer segment.
For tenure, you may have to assume that they refinance for the same tenure as the existing loan.

Shirin: O.K., so that takes care of the sources of income. As for costs, the main cost I feel is the
opportunity cost of the funds lent. Since I receive the EMI at the beginning of the month, I lose
interest for t-1 time periods on EMIt, interest for t-2 time periods on EMIt-1 and so on. I can do this
calculation for the loans held for full term as well as pre paid loans. Am I right?

Prof. S.G.: Well, that’s only one of the approaches to find the costs, but it should nevertheless
serve our purpose. What about defaults?

Shirin: I can find the expected default with the help of model 5. It has been observed that our debt
recovery office is able to collect up to 20% of the defaulted amount.
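
The expected default cost, net of recoveries, can be sketched as below. The 20% recovery rate is the upper figure quoted by Shirin; treating the full loan amount as the exposure at default is a simplifying assumption, and the function name and example inputs are hypothetical.

```python
RECOVERY_RATE = 0.20  # "up to 20%" per the case; used here as a point estimate


def expected_default_loss(p_default, exposure, recovery_rate=RECOVERY_RATE):
    """Probability of default x exposure x share that is not recovered."""
    return p_default * exposure * (1.0 - recovery_rate)


# Hypothetical customer: 5% default probability on a Rs. 10,000 loan.
loss = expected_default_loss(0.05, 10000)
```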

Prof. S.G.: Good. So it seems you can now carry forward the analysis on your own. Read Appendix
1a to 1e of my paper to get hold of the basic data, the loan bands and credit bands, and the behaviour of
customer segments towards pre payment, refinancing rate and refinancing tenure. Appendix 2a to 2e
details the data and important results of each of the five models. Appendix 3 and 4 are for people with
the lack of memory property ☺.

Questions

By segmenting customers and by tailor making schemes for each of them, by how much can you
improve the value of the customer to the bank?

Post segmentation by how much has the response of the customers to the offers made by bank
increased?

If you are asked to calculate the life time value of a customer for a retail concern, what changes will
you make to the model proposed by Prof. S.G.?
References

Barth, J.R., Caprio, Jr., G. and Levine, R., 2004, “Bank Regulation and Supervision: What Works
Best?” Journal of Financial Intermediation, 13, pp. 205-48.

Bikker, J.A., 2004, Competition and Efficiency in a Unified European Banking Market,
Cheltenham: Edward Elgar.

Claessens, S. and Laeven, L., 2004, “What Drives Bank Competition? Some International
Evidence,” Journal of Money, Credit and Banking, 36, pp. 563-83.

Ghosh, S. and Prasad, A., 2007, “Competition in Indian Banking: An Empirical Evaluation,”
South Asia Economic Journal, 8, pp. 265-84.

Hair, J.F., Black, B., Babin, B., Anderson, R.E. and Tatham, R.L., 2005, Multivariate Data
Analysis, Prentice Hall.

Hosmer, D.W. and Lemeshow, S., 2000, Applied Logistic Regression, Wiley-Interscience.

Kleinbaum, D.G. and Klein, M., 2005, Logistic Regression – A Self Learning Text, Springer.
Appendix 1.a.: Loan Amount and Lending Rates for different Loan Band and
Credit Band combinations for Full Term as well as Early Termination
Panel 1: Interest Rate for Full Term Financing
CIBIL 1 2 3 4 5 6 7 8
Rating <=4522 <=6523 <=10522 <=13523 <=16523 <=20522 <=20523 <=20523
AA 15.45 14.24 14.24 13.24 13.24 13.24 11.99 11.99
A 23.45 14.24 14.24 13.24 13.24 13.24 11.99 11.99
B 27.45 14.24 14.24 13.24 13.24 13.24 11.99 11.99
C 29.45 29.45 29.45 29.45 28.45 18.45 18.45 18.45
D 34.95 34.95 34.60 33.95 29.95 25.95 25.95 25.95
E 34.95 34.95 34.95 34.95 30.45 26.45 26.45 26.45
FS 34.95 34.95 34.95 34.95 34.45 28.45 28.45 27.45
Panel 2: Refinancing Rate (in addition to PLR) for Different Loan Bands
CIBIL 1 2 3 4 5 6 7 8
Rating <=3023 <=4023 <=5523 <=8523 <=13523 <=16523 <=20523 <=20523
A 0.00 -0.41 -0.18 -0.13 -0.15 0.00 0.00 0.00
AA 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.01
B -0.08 -0.13 -0.10 -0.14 -0.08 -0.09 0.00 -0.09
C 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
D 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
E 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
FS 0.00 0.00 0.00 -0.06 0.00 0.00 0.00 0.00
Panel 3: Refinancing Amount (in addition to current borrowing) for Different Loan Bands
CIBIL 1 2 3 4 5 6 7 8
Rating <=3023 <=4023 <=5523 <=8523 <=13523 <=16523 <=20523 <=20523
A 150.00 119.02 43.33 44.54 25.49 18.47 12.39 3.40
AA 95.78 47.84 59.29 49.05 37.14 24.04 21.48 8.43
B 122.42 75.31 43.45 57.36 29.59 14.12 10.85 12.05
C 95.14 66.84 52.68 28.04 20.82 23.53 7.99 8.26
D 67.87 48.81 37.61 24.21 19.65 6.01 7.34 4.12
E 58.51 34.39 23.69 41.69 21.55 14.50 7.87 0.11
FS 185.96 40.24 2.13 14.98 13.75 10.39 0.00 0.00
Panel 4: Pre Payment Probability for Different (Refinancing) Loan Bands
CIBIL 1 2 3 4 5 6 7 8
Rating <=3023 <=4023 <=5523 <=8523 <=13523 <=16523 <=20523 <=20523
A 0.313 0.333 0.227 0.217 0.139 0.172 0.162 0.152
AA 0.370 0.311 0.219 0.160 0.170 0.115 0.157 0.133
B 0.190 0.213 0.208 0.123 0.160 0.123 0.121 0.143
C 0.254 0.186 0.165 0.164 0.164 0.156 0.139 0.141
D 0.208 0.187 0.184 0.168 0.167 0.153 0.175 0.203
E 0.259 0.243 0.183 0.159 0.153 0.162 0.161 0.189
FS 0.221 0.172 0.202 0.217 0.173 0.244 0.150 0.067

Panel 4: Insurance Premium
Tenure (in months) Premium (%)
<=36 8
<=54 12
<=80 14
>80 16
Notes:
(i) Panel 2 presents the adjustment to be made to PLR to arrive at the Refinance Borrowing/Lending rate.
(ii) Panel 3 presents the adjustment to be made to current borrowing amount to arrive at the Refinancing
Borrowing/Lending amount.
(iii) It is assumed that prepayment is made only by clients who are going to apply for refinancing facility.
(iv) Panel 4 indicates the proportion of the loan tenure for which a pre paying client will hold the loan.
Appendix 2: Sample Data (complete data shall be provided by the authors on request)

Columns, in order: Customer No.; Real Estate Value; NMI_App1; NMI_App2; No. of dependents; Profession;
Secured/Unsecured; Tenure; CIBIL Rating; PLR for Each Customer Segment; Current Amount of Borrowing;
Loan Band for Refinancing; Loan Band for Full Term Financing
1 B 3437 0 0 7 1 60 A 10.75 15855 6 5
2 U 3705 0 0 3 0 36 D 26.99 7091 4 3
3 P 2700 0 0 4 1 60 D 29.95 12523 5 4
4 O 2787 0 1 2 0 60 D 10.75 8523 4 3
5 B 5223 0 2 14 0 60 D 10.75 10523 5 4
6 O 2797 0 0 2 0 12 D 34.95 3023 1 1
7 O 2351 0 0 7 0 60 C 28.95 8523 4 3
8 U 2930 0 0 12 0 18 D 34.95 2303 1 1
9 B 3555 0 2 12 0 60 B 21.07 10723 5 4
10 B 3588 0 2 2 0 60 C 28.95 5523 3 2
11 P 2675 0 0 12 0 60 D 34.95 7523 4 3
12 P 2390 0 0 6 0 24 D 34.95 1273 1 1
13 U 2730 0 2 3 0 42 E 34.95 1523 1 1
14 B 3318 0 0 16 0 60 D 34.95 3523 2 1
15 U 3348 0 0 3 0 60 D 18.00 9640 5 3
16 P 3220 0 2 3 0 60 C 26.99 10294 5 3
17 P 2570 0 0 6 0 60 E 34.95 4523 3 2
18 U 2659 0 0 7 0 12 D 34.95 3523 2 1
19 O 2111 0 0 12 0 12 D 34.95 2523 1 1
20 B 3749 0 2 14 0 60 B 15.44 14523 6 5
Notes:
(i) B indicates the client owns real estate.
(ii) If the loan is secured, then “1” otherwise “0”
(iii) CIBIL ratings range from “AA” to “FS” with “AA” being the best to “FS” being the worst.
(iv) Loan bands for refinancing and full term financing are as given in Appendix 1.a.
Appendix 3: Results of Prof. S.G.’s Study

variables Model 1: Probability of Response
α INTERCEPT -0.2428
β1 14.24<Optimised Borrowing Rate<=23.8 0.4003
β2 23.8<Optimised Borrowing Rate<=25.95 -1.1488
β3 25.95<Optimised Borrowing Rate<=26.99 -1.0206
β4 29.2<Optimised Borrowing Rate<=33.95 -0.8595
β5 5523<Current Borrowings<=8023 0.8649
β6 8023<Current Borrowings<=10523 0.4364
β7 10523<Current Borrowings<=15523 0.522
β8 15523<Current Borrowings<=20523 0.2994
β9 NMI < Rs. 2419 -0.4989
β10 NMI > Rs. 5170 -0.4781
β11 Secured/Unsecured -0.7556
β12 Low Credit Quality ("E" & "FS") -0.9549
β13 Profession Codes ("3", "4", "5" & "6") 0.4579
β14 Profession Codes ("10", "11", "12", "13" & "14") 0.375

variables Model 2: Probability of buying Insurance
α INTERCEPT_INS 0.055
β1 NMI_Applicant_1<1660 -0.8016
β2 1660<=NMI_Applicant_1<2299 0.1285
β3 2299<NMI_Applicant_1<2787 0.2483
β4 NMI_Applicant_1>3437 -0.2515
β5 Applicant_1 with current borrowing<3368 -0.5144
β6 6368<Applicant_1 with current borrowing<9515 0.1847
β7 10523<Applicant_1 with current borrowing<14101 0.2751
β8 14101<Applicant_1 with current borrowing<20523 0.2603
β9 Applicant_1 with current borrowing>20523 0.2736
β10 19.95%<Optimised Borrowing rate<26.5% 0.1004
β11 Low Credit Quality "FS" 0.0894
β12 No. of dependents 0.1253
β13 6<Tenure<24 -0.279
β14 Applicant_2 is employed or not 0.3542
β15 Real estate value -0.5052
variables Model 3: Probability of Prepayment
α Intercept -2.213
β1 Optimised borrowing rate >19.5% 0.1825
β2 Customers with high credit worthiness ("A", "AA") -0.2851
β3 Customers with low credit worthiness ("E", "F" & "FS") 0.2044
β4 5523<Current Borrowing<10368 0.1065
β5 10368<Current Borrowing<17523 0.1875
β6 Tenure 0.00244
β7 Real Estate Value 0.5154

variables Model 4: Probability of Refinance


α Intercept -0.97
β1 Secured / Unsecured 0.421
β2 Tenure <= 48 months 0.3208
β3 Tenure > 60 months -0.3327
β4 Current Borrowing Rate>29.95 0.1582
β5 Real Estate Value -0.3713
β6 NMI_Applicant_1<=1877 -0.507
β7 Current Borrowing<3652 0.2079
β8 Low Credit Quality ("D", "E") -0.2491
β9 Lowest Credit Quality ("F", "FS") -0.7842

variables Model 5: Probability of Default
α Intercept -3.25
β1 Optimised Borrowing rate<=13.95 -1.6135
β2 Optimised Borrowing rate>32.95 0.5563
β3 Customers with lowest credit worthiness ("E", "F" & "FS") 0.5925
β4 8523<Current Borrowing<=15523 0.1967
β5 42<Tenure<=60 0.2537
β6 3337<NMI_Applicant_1<=4019 0.2128
β7 NMI_Applicant_1>4019 0.2778
β8 Real Estate Value -0.4102
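
To show how the reported coefficients turn into a probability, here is a hedged Python sketch that scores one customer with the Model 5 (default) coefficients listed above. The dictionary keys and function names are hypothetical. Two assumptions are made: real estate value is coded as a 0/1 ownership dummy (the notes to Appendix 2 say only that "B" indicates ownership), and the rate passed in stands for the customer's optimised borrowing rate. The inputs are loosely modelled on customer 3 in the sample data.

```python
import math

# Model 5 coefficients as reported in Appendix 3.
MODEL5 = {
    "intercept": -3.25,
    "rate_le_13_95": -1.6135,
    "rate_gt_32_95": 0.5563,
    "low_credit": 0.5925,         # ratings "E", "F", "FS"
    "borrow_8523_15523": 0.1967,
    "tenure_42_60": 0.2537,
    "nmi_3337_4019": 0.2128,
    "nmi_gt_4019": 0.2778,
    "real_estate": -0.4102,       # assumption: 1 if the client owns real estate
}


def default_log_odds(rate, rating, borrowing, tenure, nmi, owns_real_estate):
    """Sum the intercept and the coefficients of every dummy that applies."""
    z = MODEL5["intercept"]
    if rate <= 13.95:
        z += MODEL5["rate_le_13_95"]
    elif rate > 32.95:
        z += MODEL5["rate_gt_32_95"]
    if rating in ("E", "F", "FS"):
        z += MODEL5["low_credit"]
    if 8523 < borrowing <= 15523:
        z += MODEL5["borrow_8523_15523"]
    if 42 < tenure <= 60:
        z += MODEL5["tenure_42_60"]
    if 3337 < nmi <= 4019:
        z += MODEL5["nmi_3337_4019"]
    elif nmi > 4019:
        z += MODEL5["nmi_gt_4019"]
    if owns_real_estate:
        z += MODEL5["real_estate"]
    return z


def logistic(z):
    """Convert log odds into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


z3 = default_log_odds(rate=29.95, rating="D", borrowing=12523,
                      tenure=60, nmi=2700, owns_real_estate=False)
p3 = logistic(z3)
```

Only the borrowing-band and tenure dummies apply for these inputs, so the log odds are the intercept plus those two coefficients. The same pattern carries over to the other four models with their own coefficient tables.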
Appendix 4: A Note on Logit

Logit is an approach to solving a discrete choice problem, i.e. a choice to buy or not, to go to
school or not, to default or not, etc. In other words, the dependent variable is dichotomous, and one
or more of the independent variables may be dichotomous as well (independent variables can be
continuous too). The standard linear probability models (e.g. Y = γ + ϕX + e) fail to address such
problems because of:
(i) Non-normality of disturbances
(ii) Heteroskedastic disturbances
(iii) Predictions that are not guaranteed to fall in the range of zero to one.

Hence, one needs a positive monotone function that maps the linear predictor (γ + ϕx)
onto the unit interval. The transformation, while retaining the fundamental linear structure of the
model, avoids outcomes below zero or above one. Any cumulative distribution function (cdf)
meets this requirement:
$$\pi_i = p(y_i) = p(\gamma + \phi x)$$
Hence, the transformation can be either a standard normal cdf or a logistic cdf.

If the transformation p(.) chosen is the cdf of a standard normal distribution,

$$\pi_i = \Phi(\gamma + \phi x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\gamma + \phi x} e^{-z^2/2}\, dz,$$

then it yields a linear probit model.

If the transformation p(.) chosen is the cdf of a standard logistic distribution,

$$\pi_i = \Lambda(\gamma + \phi x) = \frac{1}{1 + e^{-(\gamma + \phi x)}},$$

then it yields the linear logistic regression or the linear logit model.

Despite the similarity, the logistic regression model enjoys certain benefits over the probit
model:
(i) The logit model is simpler, particularly in the case of polytomous data where one may
require a multivariate logistic or normal distribution.
(ii) The output of a logit analysis is simpler and easier to interpret.
(iii) The logit is symmetric around zero and unbounded above and below, making it a
good candidate for the response variable side of a linear model.
The present note focuses exclusively on binary logit. A binary "logit" model can be represented
as:
ln[p/(1-p)] = α + βX + e (1)
where,
p is the probability that the event Y occurs, p(Y=1)
p/(1-p) is the odds (loosely called the "odds ratio" in this note)
ln[p/(1-p)] is the log odds, or "logit"

The logistic distribution constrains the estimated probabilities to lie between 0 and 1.
The estimated probability is given by:
p = 1/[1 + exp(-α - βX)]
= 1/[1 + exp(-z)], where z = α + βX (2)
If α + βX = 0, then p = 0.50.
As α + βX becomes large and positive, p approaches 1.
As α + βX becomes large and negative, p approaches 0.
Equation (2) is known as the cumulative logistic distribution function, where zi ranges from
−∞ to +∞ and pi ranges from 0 to 1.
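
Equations (1) and (2) are inverses of each other, which the following Python sketch illustrates. The function names are hypothetical; the two-branch form of the logistic function is a common numerical precaution against overflow for very negative z and is not part of the original note.

```python
import math


def logit(p):
    """Equation (1) without the error term: log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def logistic(z):
    """Equation (2): map a log odds value z to a probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Algebraically identical form that avoids overflow when z is very negative.
    ez = math.exp(z)
    return ez / (1.0 + ez)


p_mid = logistic(0.0)          # z = 0 gives even odds
round_trip = logit(logistic(1.3))
```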

Features of the model


(i) As the probability goes from 0 to 1, the logit goes from −∞ to +∞ (while the odds go from 0
to +∞). In other words, although the probabilities are bounded, the logit is not.
(ii) Although the logit is linear in x, the probabilities are not.
(iii) β, the slope, measures the change in the log odds (in favour of one decision vis-à-vis another)
for a unit change in x, and α is the value of the log odds in favour of the decision when x=0.
(iv) The linear probability model (LPM) assumes that the probability is linearly related to x, but
the logit model assumes that the log odds is linearly related to x.
Appendix 5: A Note on Optimisation in MS Excel

Optimization refers to choosing the best element from some set of available alternatives. In the
simplest case, this means solving problems in which one seeks to minimize or maximize a real
function by systematically choosing the values of real or integer variables from within an
allowed set.

The function f being minimized or maximized is called, variously, an objective function, cost
function, energy function, or energy functional. A feasible solution that minimizes (or maximizes,
if that is the goal) the objective function is called an optimal solution.

Generally, when the feasible region or the objective function of the problem does not present
convexity, there may be several local minima and maxima, where a local minimum x* is defined
as a point for which there exists some δ > 0 so that for all x such that
║x- x*║≤ δ ;
the expression
f(x*) ≤ f(x) holds;
that is to say, on some region around x* all of the function values are greater than or equal to the
value at that point. Local maxima are defined similarly.
Microsoft Excel uses the Solver add-in to perform optimization. Solver is part of a suite of
commands sometimes called what-if analysis. With Solver, you can find an optimal value for a
formula in one cell, called the target cell, on a worksheet. Solver adjusts the values in the
changing cells you specify, called the adjustable cells, to produce the result you specify from
the target cell formula. You can apply constraints to restrict the values Solver can use in the
model, and the constraints can refer to other cells that affect the target cell formula. Use Solver
to determine the maximum or minimum value of one cell by changing other cells; for example,
you can change the borrowing rate and see the effect on the customer lifetime value (CLV) of each
customer. Solver works with three types of cells, namely the target cell, constrained cells and
adjustable cells. Our objective is to maximize the target cell, which is the total CLV of all the
customers. This maximization is performed by changing the adjustable cells, which stand for the
borrowing rates of different segments of customers. The lower limit (11.99) and upper limit (35)
of borrowing rates are put as constraints. After Solver runs, we get the optimal solution for total CLV.
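
For readers without Excel, the same idea (maximize a target value by varying a borrowing rate within the bounds 11.99 and 35) can be illustrated with a plain grid search in Python. This is a toy sketch, not the Solver algorithm: the response curve, its parameters, the loan amount and all function names are hypothetical, and a real analysis would use the case's five models and sum CLV over all segments.

```python
import math

RATE_MIN, RATE_MAX = 11.99, 35.00  # bounds on the borrowing rate given in the note


def response_probability(rate):
    """Hypothetical response curve: customers respond less as the rate rises."""
    z = 2.0 - 0.15 * rate  # made-up coefficients, for illustration only
    return 1.0 / (1.0 + math.exp(-z))


def segment_value(rate, amount=10000.0):
    """Toy target cell: response probability x one year of simple interest."""
    return response_probability(rate) * amount * rate / 100.0


def best_rate(step_bp=1):
    """Evaluate every rate on a 0.01-point grid within the bounds."""
    lo, hi = round(RATE_MIN * 100), round(RATE_MAX * 100)
    candidates = [i / 100.0 for i in range(lo, hi + 1, step_bp)]
    return max(candidates, key=segment_value)


optimal_rate = best_rate()
```

A grid search is used here only because it is easy to verify by hand; a gradient-based routine would be the closer analogue of what Solver does on smooth problems.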

The Microsoft Excel Solver tool uses the Generalized Reduced Gradient (GRG2) nonlinear
optimization code developed by Leon Lasdon, University of Texas at Austin, and Allan Waren,
Cleveland State University. Linear and integer problems use the simplex method with bounds on
the variables, and the branch-and-bound method, implemented by John Watson and Dan Fylstra,
Frontline Systems, Inc.
