From credit scoring to regulatory scoring: comparing credit scoring models from a regulatory perspective

Yufei Xia; Zijun Liao; Jun Xu; Yinguo Li

doi:10.3846/tede.2022.17045

Abstract

Conventional credit scoring models, evaluated by predictive accuracy or profitability, typically serve financial institutions and can hardly reflect their contribution to financial stability. To remedy this, we develop a novel regulatory scoring framework to quantify and compare the regulatory capital charge errors of credit scoring models. As an application of RegTech, the proposed framework accounts for the example-dependent and cost-sensitive nature of credit scoring, which is expected to enhance the risk absorption ability of financial institutions and thus benefit regulators. Empirical results on two real-world credit datasets reveal that credit scoring models with good predictive accuracy or profitability do not necessarily yield low capital charge requirement errors, which further highlights the importance of the regulatory scoring framework. The gradient boosting decision tree (GBDT) family provides significantly better average performance than industry benchmarks and a deep multilayer perceptron network, especially when financial stability is the primary focus. To further examine the robustness of the proposed regulatory scoring, sampling techniques, cut-off value modification, and probability calibration are employed within the framework, and the main conclusions hold in most cases. Furthermore, an interpretability analysis using the TreeSHAP algorithm alleviates concerns about the transparency of GBDT-based models and confirms the important roles of loan characteristics, borrowers’ solvency, and creditworthiness as powerful predictors in credit scoring. Finally, the managerial implications for both financial institutions and regulators are discussed.
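The idea of a regulatory capital charge error can be illustrated with a rough sketch. The code below is not the authors' implementation: it assumes the Basel II IRB risk weight function for "other retail" exposures, a hypothetical default LGD of 0.45, and a hypothetical reference PD against which a model's predicted PD is compared. The function names and the signed-difference error definition are illustrative assumptions, and the paper's actual error measure may differ.

```python
from math import exp, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution


def retail_correlation(pd_: float) -> float:
    """Basel II asset correlation for 'other retail' exposures (between 0.03 and 0.16)."""
    w = (1 - exp(-35 * pd_)) / (1 - exp(-35))
    return 0.03 * w + 0.16 * (1 - w)


def capital_requirement(pd_: float, lgd: float = 0.45) -> float:
    """Capital requirement K per unit of exposure, without maturity adjustment.

    lgd=0.45 is an assumed illustrative value, not taken from the paper.
    """
    pd_ = min(max(pd_, 1e-6), 1 - 1e-6)  # keep the inverse CDF well defined
    r = retail_correlation(pd_)
    # PD conditional on a 99.9th-percentile adverse systematic factor
    stressed_pd = _N.cdf((_N.inv_cdf(pd_) + sqrt(r) * _N.inv_cdf(0.999)) / sqrt(1 - r))
    return lgd * (stressed_pd - pd_)


def capital_charge_error(pd_pred: float, pd_ref: float,
                         ead: float = 1.0, lgd: float = 0.45) -> float:
    """Signed capital charge error for one loan (assumed definition).

    Negative values mean the model's PD implies too little capital
    relative to the reference PD; positive values mean too much.
    """
    return ead * (capital_requirement(pd_pred, lgd) - capital_requirement(pd_ref, lgd))


if __name__ == "__main__":
    # A model that predicts PD = 1% for a loan whose reference PD is 5%
    print(capital_charge_error(pd_pred=0.01, pd_ref=0.05, ead=10_000))
```

Because the error depends on each loan's exposure and on where its PD falls on the nonlinear risk weight curve, two models with similar classification accuracy can imply quite different capital charge errors, which is the kind of gap the abstract describes.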

Keywords: credit scoring, RegTech, regulatory scoring, probability of default, financial regulation, gradient boosting decision tree

How to Cite

Xia, Y., Liao, Z., Xu, J., & Li, Y. (2022). From credit scoring to regulatory scoring: comparing credit scoring models from a regulatory perspective. Technological and Economic Development of Economy, 28(6), 1954–1990. https://doi.org/10.3846/tede.2022.17045

Published in Issue

Dec 15, 2022

This work is licensed under a Creative Commons Attribution 4.0 International License.
