Dynamic credit scoring using payment prediction
Credit scoring is a common tool that lenders use in credit risk management. However, existing credit scoring methods are error-prone. Credit scoring failures significantly affect the next process, payment collection from customers: bad customers who are incorrectly approved by credit scoring end up making overdue payments. In this dissertation, we propose a solution for pre-empting overdue payments and improving credit scoring performance. First, we use data mining algorithms, including Logistic Regression, C4.5, and Bayesian Network, to construct payment prediction models that identify overdue payments in advance. With payment prediction, the lender learns earlier which customers are likely to pay late and can proactively approach them so that they pay on schedule. Second, we define a refined scoring model that uses feedback from the payment prediction models to improve the initial credit scoring mechanism. The payment prediction results provide information for reviewing the combinations of current credit scoring parameters that perform poorly. Updating these parameters is expected to increase credit scoring performance significantly, and the resulting mechanism yields a dynamic credit scoring model. We also investigate the impact of the imbalanced data problem on the payment prediction process, and we employ data segmentation to overcome it. With a novel data segmentation technique, which we call Majority Bad Payment Segments (MBPS), learning bad payments becomes much easier. The results of our experiments show that payment prediction based on MBPS performs much better than conventional methods of dealing with imbalanced data.
We perform extensive experimentation and evaluation with a variety of metrics such as Hit Rates, Cost Coverage, F-measure, and the Area under Curve measure.
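
As an illustration of two of the metrics named above, the following minimal Python sketch computes the F-measure and the Area under Curve for a binary payment predictor. It is not the dissertation's evaluation code: the function names, the label convention (1 = overdue payment, 0 = on-time payment), the 0.5 decision threshold, and the toy data are all assumptions made for this example. Hit Rates and Cost Coverage are omitted because their definitions are not given here.

```python
from typing import Sequence


def f_measure(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 1.0) -> float:
    """F-measure for the positive class (assumed here: 1 = overdue payment)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive receives a higher score than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 3 overdue payments among 8 (an imbalanced sample).
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.05, 0.8]  # predicted probability of "overdue"
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed threshold of 0.5

print(f"F-measure: {f_measure(y_true, y_pred):.3f}")
print(f"AUC:       {auc(y_true, scores):.3f}")
```

The F-measure depends on the chosen threshold, while the Area under Curve summarizes ranking quality across all thresholds, which is one reason both are commonly reported together for imbalanced data.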