Feature selection for distance-based regression : An umbrella review and a one-shot wrapper
Linja, J., Hämäläinen, J., Nieminen, P., & Kärkkäinen, T. (2023). Feature selection for distance-based regression : An umbrella review and a one-shot wrapper. Neurocomputing, 518, 344-359. https://doi.org/10.1016/j.neucom.2022.11.023
Published in
Neurocomputing
Date
2023
Discipline
Engineering; Computing Education Research; Degree Education; Learning and Cognitive Sciences; Human and Machine based Intelligence in Learning; Mathematical Information Technology
Copyright
© 2022 The Authors. Published by Elsevier B.V.
Feature selection (FS) may improve the performance, cost-efficiency, and understandability of supervised machine learning models. In this paper, FS for the recently introduced distance-based supervised machine learning model is considered for regression problems. The study is contextualized by first providing an umbrella review (review of reviews) of recent development in the research field. We then propose a saliency-based one-shot wrapper algorithm for FS, which is called MAS-FS. The algorithm is compared with a set of other popular FS algorithms, using a versatile set of simulated and benchmark datasets. Finally, experimental results underline the usefulness of FS for regression, confirming the utility of certain filter algorithms and particularly the proposed wrapper algorithm.
Publisher
Elsevier
ISSN
0925-2312
Publication in research information system
https://converis.jyu.fi/converis/portal/detail/Publication/160101260
Related funder(s)
Research Council of Finland
Funding program(s)
Others, AoF; Academy Programme, AoF
Additional information about funding
This work has been supported by the Academy of Finland through the projects 315550 (HNP-AI) and 351579 (MLNovCat). We acknowledge grants of computer capacity from the Finnish Grid and Cloud Infrastructure (FCCI; persistent identifier urn:nbn:fi:research-infras-2016072533).
Related items
Showing items with similar title or keywords.
- Problem Transformation Methods with Distance-Based Learning for Multi-Target Regression
  Hämäläinen, Joonas; Kärkkäinen, Tommi (ESANN, 2020)
  Multi-target regression is a special subset of supervised machine learning problems. Problem transformation methods are used in the field to improve the performance of basic methods. The purpose of this article is to test ...
- Monte Carlo Simulations of Au38(SCH3)24 Nanocluster Using Distance-Based Machine Learning Methods
  Pihlajamäki, Antti; Hämäläinen, Joonas; Linja, Joakim; Nieminen, Paavo; Malola, Sami; Kärkkäinen, Tommi; Häkkinen, Hannu (American Chemical Society, 2020)
  We present an implementation of distance-based machine learning (ML) methods to create a realistic atomistic interaction potential to be used in Monte Carlo simulations of thermal dynamics of thiolate (SR) protected gold ...
- Do Randomized Algorithms Improve the Efficiency of Minimal Learning Machine?
  Linja, Joakim; Hämäläinen, Joonas; Nieminen, Paavo; Kärkkäinen, Tommi (MDPI AG, 2020)
  Minimal Learning Machine (MLM) is a recently popularized supervised learning method, which is composed of distance-regression and multilateration steps. The computational complexity of MLM is dominated by the solution of ...
- How can algorithms help in segmenting users and customers? : A systematic review and research agenda for algorithmic customer segmentation
  Salminen, Joni; Mustak, Mekhail; Sufyan, Muhammad; Jansen, Bernard J. (Palgrave Macmillan, 2023)
  What algorithm to choose for customer segmentation? Should you use one algorithm or many? How many customer segments should you create? How to evaluate the results? In this research, we carry out a systematic literature ...
- Instance-Based Multi-Label Classification via Multi-Target Distance Regression
  Hämäläinen, Joonas; Nieminen, Paavo; Kärkkäinen, Tommi (ESANN, 2021)
  Interest in multi-target regression and multi-label classification techniques and their applications have been increasing lately. Here, we use the distance-based supervised method, minimal learning machine (MLM), as a base ...