Study of various machine learning approaches to predict default behavior of a borrower based on transactional dataset

Abstract
Predicting ‘default’ behavior of borrowers is quite challenging and time consuming, although financial institutions require faster and more reliable decision on loan applications to survive in the competitive market. Availability of huge amount of data makes the work of current credit scoring system harder. To deal with such situation machine learning engineers are trying to build a system that can predict default behavior of a borrower by analyzing application and transaction data. In our current study we applied different machine learning models such as decision tree, logistic regression, gradient boosting, XGBoosting, support vector machine and KNeighbors on transactional dataset to find which model performed better. We also applied deep neural network on the datasets. To further extend the study, we created new features by using manual process and unsupervised machine learning to observe whether they boost the performance or not. In addition to that, we used feature selection to see how it affected the prediction. Due to small dataset, we achieved 70% ac-curacy with 72% AUC on aggregated dataset from Random Forest. The dataset created by using unsupervised machine learning showed 62% accuracy with 68% AUC value. Manually created ratio-based features and feature selection could not yield any significant difference in results. Deep learning also per-formed lower than others probably due to small dataset.
Main Author
Format
Theses Master thesis
Published
2021
Subjects
The permanent address of the publication
https://urn.fi/URN:NBN:fi:jyu-202104162383Use this for linking
Language
English
License
In CopyrightOpen Access

Share