Extant evidence shows that in the past two decades bankruptcies and defaults have occurred at higher rates than at any previous time. Owing to recent financial crises and regulatory concerns, credit risk assessment has seen a resurgence of interest from both the academic world and the business community. Especially for credit-granting institutions, such as commercial banks and some credit card companies, the ability to discriminate good customers from bad ones is crucial. To enable the interested parties to take either preventive or corrective action, efficient and reliable models that predict defaults accurately are imperative.
The general approach of credit risk analysis is to apply a classification technique to data on previous customers, both faithful and delinquent, in order to find a relation between customer characteristics and potential failure. Accurate classifiers are needed to categorize new applicants or existing customers as good or bad.
In his seminal paper, Fisher (1936) introduced statistical discriminant analysis, which seeks a linear classifier that best differentiates between two groups, here satisfactory and unsatisfactory customers. Nonlinear regression models, namely logistic regression (Wiginton, 1980) and probit regression (Grablowsky and Talley, 1981), have also been applied in credit risk analysis.
Linear programming and integer programming have also found successful application in the credit risk assessment arena. The idea is to apply programming techniques to choose a weight vector, w, so that the weighted sum of the answers, w'x, is above some cutoff value for the good applicants and below the cutoff value for the bad ones. The classical work on classification by programming can be found in Mangasarian (1965) and Glover (1990).
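One common way of writing this idea down, given here only as a sketch, is to minimize the total amount by which applicants fall on the wrong side of the cutoff. The symbols c (the cutoff), a_i (the nonnegative deviation allowed for applicant i), and the index sets G and B (good and bad applicants) are notation assumed for this illustration and do not appear in the text above.

```latex
\min_{w,\,c,\,a}\quad \sum_{i \in G \cup B} a_i
\qquad\text{s.t.}\qquad
\begin{aligned}
  w'x_i &\ge c - a_i, && i \in G,\\
  w'x_i &\le c + a_i, && i \in B,\\
  a_i   &\ge 0,       && i \in G \cup B.
\end{aligned}
```

As written, the program admits the trivial solution w = 0, c = 0, so in practice some normalization (for example, fixing the cutoff at a nonzero value) has to be added.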
The nearest-neighbor approach is a standard nonparametric approach to the classification problem. It was applied in credit risk analysis first by Chatterjee and Barcun (1970) and later by Henley and Hand (1996). The idea is to choose a metric on the space of applicant data to measure how far apart any two applicants are. Then, with a sample of past applicants as a representative standard, a new applicant is classified as good or bad depending on the proportions of “goods” and “bads” among the k nearest applicants from the representative sample, that is, the new applicant’s nearest neighbors.
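The nearest-neighbor rule described above can be sketched in a few lines. This is a minimal illustration, assuming the Euclidean metric and a simple majority vote among the k neighbors; the function name `knn_classify` and the sample data are hypothetical and are not taken from the cited studies.

```python
import math
from collections import Counter


def knn_classify(sample, labels, applicant, k=3):
    """Label `applicant` by majority vote among its k nearest past applicants."""
    # Rank past applicants by Euclidean distance to the new applicant
    # (the choice of metric is an assumption of this sketch).
    ranked = sorted(
        zip(sample, labels),
        key=lambda pair: math.dist(pair[0], applicant),
    )
    # Count the "good" and "bad" labels among the k closest applicants.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical past applicants: (income in thousands, years at current address)
past = [(20, 1), (25, 2), (30, 1), (60, 8), (70, 10), (65, 9)]
outcome = ["bad", "bad", "bad", "good", "good", "good"]

print(knn_classify(past, outcome, (62, 7), k=3))
```

An odd k avoids ties in the two-class case. In practice the answers would usually be rescaled first so that no single characteristic dominates the distance.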
The classification tree, also known as the recursive partitioning algorithm (RPA), is a completely different statistical approach to classification. The idea is to split the set of application answers into different sets and then identify each of these sets as good or bad depending on what the majority is in that set. Makowski (1985) and Coffman (1986) first applied the method in credit risk analysis.
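A single partitioning step of this kind can be sketched as follows. The sketch assumes one numeric answer per applicant and picks the threshold that minimizes the number of misclassified past applicants when each side is labeled by its majority; practical tree algorithms typically use other splitting criteria and recurse on each side. The names `majority` and `best_split` are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def best_split(values, labels):
    """Return (errors, threshold, left_label, right_label) for the best split,
    or None if all values are identical and no split is possible."""
    best = None
    # Candidate thresholds: every distinct value except the largest,
    # so that both sides of the split are non-empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        # Applicants that disagree with their side's majority are errors.
        errors = (len(left) - Counter(left)[majority(left)]) + (
            len(right) - Counter(right)[majority(right)]
        )
        if best is None or errors < best[0]:
            best = (errors, t, majority(left), majority(right))
    return best


# Hypothetical answers (years at current address) and observed outcomes
years = [1, 2, 3, 8, 9, 10]
result = ["bad", "bad", "bad", "good", "good", "good"]
print(best_split(years, result))
```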
Neural networks, which were originally developed from attempts to model communication and information processing in the human brain, are among the most promising credit risk analysis models and have been adopted by many credit risk analysis systems. Neural networks can be developed to classify nonlinearly separable cases using multiple-layer networks with nonlinear transfer functions. However, neural networks introduce new problems such as overfitting and an opaque mechanism. There is a vast literature on neural network applications in credit risk analysis; recent publications include Smalz and Conrad (1994), Piramuthu (1999), Malhotra and Malhotra (2002, 2003) and Lai et al. (2006d).
The genetic algorithm is a procedure for systematically searching through a population of potential solutions to a problem so that candidate solutions that come closer to solving the problem have a greater chance of being retained in the population than others. Its application in the credit risk analysis field can be found in Varetto (1998) and Chen and Huang (2003).
Rough set theory was introduced by Pawlak (1982). Its philosophy is founded on the assumption that with every object of the universe of discourse we associate some information (data, knowledge). It can be approached as an extension of classical set theory, for use when representing incomplete knowledge. Rough sets can be considered as sets with fuzzy boundaries, that is, sets that cannot be precisely characterized using the available set of attributes. Rough set theory can complement other theories that deal with data uncertainty, such as probability theory and fuzzy set theory. In Dimitras et al. (1999), rough set theory is used to discriminate between healthy and failing firms in order to predict business failure.
The above is just a partial list of commonly used credit risk analysis methods. Some combined or ensemble classifiers, which integrate two or more single classification methods, have shown higher predictive accuracy than individual methods. For example, Lai et al. (2006b) used a neural network ensemble model for credit risk evaluation. Research into combined classifiers in credit risk analysis is currently flourishing. A good recent survey on credit risk assessment is Thomas (2002).
The support vector machine (SVM) was first proposed by Vapnik (1995, 1998a, 1998b). Unlike classical methods that merely minimize the empirical training error, SVM aims at minimizing an upper bound of the generalization error by maximizing the margin between the separating hyperplane and the data. It is a powerful and promising data classification and function estimation tool. In this method the input vectors are mapped into a higher dimensional feature space, and an optimal separating hyperplane is constructed in this space. SVMs have been successfully applied to a number of applications ranging from bioinformatics to text categorization and face or fingerprint identification.
Applications of SVM in credit analysis include Van Gestel et al. (2003a, 2003b), Stecking and Schebesch (2003), Huang et al. (2004), and Lai et al. (2006a, 2006c). Huang et al. (2004) used two datasets from Taiwanese financial institutions and United States commercial banks as an experimental test bed to compare the performance of SVM with back propagation neural networks. Their results showed that SVMs achieved accuracy comparable to that of back propagation neural networks. Van Gestel et al. (2003a, 2003b) compared classical linear rating methods with state-of-the-art SVM techniques. Their test results indicate that the SVM methodology yielded significantly better results on an out-of-sample test set. Other studies have also found SVM useful for credit risk analysis and evaluation.
One of the main drawbacks in the application of the standard SVM is its sensitivity to outliers or noise in the training sample due to overfitting, as shown in Guyon (1996) and Zhang (1999). In the work of Huang and Liu (2002) and Lin and Wang (2002), a fuzzy SVM is proposed to deal with the problem. Each instance is assigned a membership according to its distance from its own class. In the fuzzy SVM, each instance’s contribution to the total error term in the objective function is weighted by its membership rather than by a uniform weight of 1. Their experimental results show that the fuzzy SVM can reduce the effect of outliers and yield a higher classification rate than traditional SVMs.
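The weighting described above can be made concrete with a sketch of the soft-margin objective. The notation is assumed for this illustration only: m_i is the membership of instance i, ξ_i its slack (error) variable, y_i ∈ {+1, −1} its class label, φ the feature map, and C the regularization parameter.

```latex
\min_{w,\,b,\,\xi}\quad \frac{1}{2}\,w^{\top}w + C\sum_{i=1}^{N} m_i\,\xi_i
\qquad\text{s.t.}\qquad
y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,\quad
\xi_i \ge 0,\quad i = 1,\dots,N.
```

Setting m_i = 1 for every instance recovers the standard soft-margin SVM, while a small m_i reduces the penalty for misclassifying a suspected outlier.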
Motivated by Huang and Liu (2002) and Lin and Wang (2002), this chapter presents a new fuzzy SVM, called the “bilateral-weighted fuzzy SVM”, to evaluate credit risk. Unlike the fuzzy SVM described in Huang and Liu (2002) and Lin and Wang (2002), this chapter treats each instance as belonging to both the positive and the negative class, but with different memberships. This can also be regarded as constructing two instances from the original instance and assigning them memberships of the positive and negative classes respectively. If an instance is detected as an outlier, which means that it is very likely to fall in one class but actually falls in the contrary class, we treat it as a member of the former class with large membership and at the same time as a member of the contrary class with small membership. The intuition is that in this way we can make more efficient use of the training sample and achieve better generalization ability.
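The construction of two weighted instances from each original instance can be sketched as follows. This is only an illustration: it assumes that each instance comes with a membership m in [0, 1] for the positive class and that its membership of the negative class is taken as 1 − m. The function name `bilateral_expand` is hypothetical.

```python
def bilateral_expand(instances, memberships):
    """Turn each instance x with membership m into two weighted instances.

    Assumption of this sketch: m is the membership of x in the positive
    class, and 1 - m is used as its membership in the negative class.
    Returns a list of (x, label, weight) triples.
    """
    expanded = []
    for x, m in zip(instances, memberships):
        if not 0.0 <= m <= 1.0:
            raise ValueError("membership must lie in [0, 1]")
        expanded.append((x, +1, m))        # x as a positive-class member
        expanded.append((x, -1, 1.0 - m))  # x as a negative-class member
    return expanded


# Hypothetical customer with a high membership of the positive class
print(bilateral_expand([(1.0, 2.0)], [0.9]))
```

The resulting triples could then be passed to any SVM trainer that accepts per-instance weights on the error term.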
The economic meaning of the bilateral-weighted fuzzy SVM is that even the most reliable customer may default on his or her debt, and vice versa.
Kwok (1998), Suykens and Vandewalle (1999) and Van Gestel et al. (2002) provided a probabilistic framework for understanding the SVM or least squares SVM. Their work mainly focused on automatically adjusting the regularization parameter and the kernel parameter to near-optimal values.
The bilateral-weighted fuzzy SVM also differs from the least squares fuzzy SVM presented in the previous chapter and from the fuzzy least squares SVM proposed by Tsujinishi and Abe (2003), which is designed for multi-class classification. In their paper, Tsujinishi and Abe defined a membership function in the direction perpendicular to the optimal separating hyperplane that separates a pair of classes, and then determined the membership of each datum for each class using the minimum or average operator. In each of their two-class classification computations, the objective function is the same as in the basic LS-SVM, while in the bilateral-weighted fuzzy SVM the total error term is weighted by the memberships of the data for their classes. In other words, Tsujinishi and Abe (2003) used memberships generated from the results of pairwise classification by the standard LS-SVM to determine an instance’s membership of a class through the minimum or average operator, while the bilateral-weighted fuzzy SVM uses memberships generated by some credit risk analysis methods to weight the error term in the SVM objective function.
The rest of the chapter is organized as follows. Section 6.2 describes the formulation of the bilateral-weighted fuzzy SVM in detail. The empirical results of the proposed bilateral-weighted fuzzy SVM model are described in Section 6.3. Section 6.4 concludes the chapter.
6.2 Formulation of the Bilateral-Weighted Fuzzy SVM