Study of various machine learning approaches to predict default behavior of a borrower based on transactional dataset
Authors
Date
2021
Copyright
This publication is copyrighted. You may download, display, and print it for your own personal use. Commercial use is prohibited.
Predicting the default behavior of borrowers is challenging and time-consuming, yet financial institutions need faster and more reliable decisions on loan applications to survive in a competitive market. The sheer volume of available data makes the task harder for current credit scoring systems. To address this, machine learning engineers are trying to build systems that predict a borrower's default behavior by analyzing application and transaction data. In this study, we applied several machine learning models, including decision tree, logistic regression, gradient boosting, XGBoost, support vector machine, and k-nearest neighbors, to a transactional dataset to find which model performed best. We also applied a deep neural network to the datasets. To extend the study, we created new features, both manually and with unsupervised machine learning, to observe whether they improved performance. In addition, we used feature selection to see how it affected prediction. Because the dataset was small, the best result was 70% accuracy with 72% AUC, obtained with Random Forest on the aggregated dataset. The dataset created with unsupervised machine learning yielded 62% accuracy with 68% AUC. Manually created ratio-based features and feature selection did not yield any significant difference in results. Deep learning also performed worse than the other models, probably because of the small dataset.
...
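The abstract reports both accuracy and AUC for each model. As a minimal sketch of how these two metrics differ, the Python snippet below computes them from scratch for a binary default classifier. The labels and scores are hypothetical values invented for illustration, not data from the thesis, and the function names `accuracy` and `roc_auc` are our own; the thesis does not state how its metrics were computed.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of borrowers whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random defaulter is scored above a random non-defaulter.

    Ties count as half a win (the Mann-Whitney U formulation of ROC AUC).
    """
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = default) and model scores, invented for illustration.
y_true = [0, 0, 0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.3, 0.35, 0.6, 0.4, 0.7, 0.2, 0.8, 0.55, 0.45]

acc = accuracy(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(f"accuracy={acc:.2f} auc={auc:.2f}")
```

Accuracy depends on the chosen decision threshold (0.5 here, an assumption), whereas AUC depends only on how the scores rank defaulters relative to non-defaulters. This is one reason the two figures can diverge for the same model.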


Keywords
Collections
- Pro gradu -tutkielmat [24525]
Related items
Showing items with similar title or keywords.
- Description of movement sensor dataset for dog behavior classification
  Vehkaoja, Antti; Somppi, Sanni; Törnqvist, Heini; Valldeoriola Cardó, Anna; Kumpulainen, Pekka; Väätäjä, Heli; Majaranta, Päivi; Surakka, Veikko; Kujala, Miiamaaria V.; Vainio, Outi (Elsevier, 2022)
  Movement sensor data from seven static and dynamic dog behaviors (sitting, standing, lying down, trotting, walking, playing, and (treat) searching i.e. sniffing) was collected from 45 middle to large sized dogs with six ...
- CCTVCV : Computer Vision model/dataset supporting CCTV forensics and privacy applications
  Turtiainen, Hannu; Costin, Andrei; Hämäläinen, Timo; Lahtinen, Tuomo; Sintonen, Lauri (IEEE, 2022)
  The increased, widespread, unwarranted, and unaccountable use of Closed-Circuit TeleVision (CCTV) cameras globally has raised concerns about privacy risks for the last several decades. Recent technological advances implemented ...
- Predicting physical activity change in cancer survivors : an application of the Health Action Process Approach
  Hardcastle, Sarah J.; Maxwell-Smith, Chloe; Hagger, Martin S. (Springer, 2022)
  Purpose: Previous research has not examined the utility of the Health Action Process Approach (HAPA) to predict physical activity (PA) change in cancer survivors. The aim of the study was to investigate the efficacy of a ...
- Improvements and applications of the elements of prototype-based clustering
  Hämäläinen, Joonas (Jyväskylän yliopisto, 2018)
  Clustering or cluster analysis is an essential part of data mining, machine learning, and pattern recognition. The most popularly applied clustering methods are partitioning-based or prototype-based methods. Prototype-based ...
- Predicting ACL Injury Using Machine Learning on Data From an Extensive Screening Test Battery of 880 Female Elite Athletes
  Jauhiainen, Susanne; Kauppi, Jukka-Pekka; Krosshaug, Tron; Bahr, Roald; Bartsch, Julia; Äyrämö, Sami (SAGE Publications, 2022)
  Background: Injury risk prediction is an emerging field in which more research is needed to recognize the best practices for accurate injury risk assessment. Important issues related to predictive machine learning need to ...