Comparison of cluster validation indices with missing data

Abstract
Clustering is an unsupervised machine learning technique that aims to divide a given set of data into subsets. The number of hidden groups in cluster analysis is not always obvious, and various cluster validation indices have been suggested for determining it. Several recent studies have reviewed validation indices, but no experiments with missing data are yet available. In this paper, the performance of ten well-known indices is measured on ten synthetic data sets with various ratios of missing values, using clustering based on squared Euclidean and city block distances. The original indices are modified for the city block distance in a novel way. The experiments illustrate the different degrees of stability of the indices with respect to missing data.
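To make the setting in the abstract concrete, the following is a minimal sketch of how a validation index might be evaluated when some values are missing. It is not the authors' method: the abstract does not describe their index modifications or their missing-data strategy. The sketch assumes that missing values are encoded as NaN, that distances are computed over observed components only, and that prototypes are per-feature means (squared Euclidean) or medians (city block). The function names `partial_distance`, `prototype`, and `ray_turi_style_index` are hypothetical, and the index shown (mean within-cluster distance divided by the smallest between-prototype distance) is one illustrative choice; the abstract does not say whether it is among the ten indices studied.

```python
import numpy as np


def partial_distance(x, c, metric="sqeuclidean"):
    """Distance between x and c using only components observed in both.

    Assumption: missing values are encoded as NaN.
    """
    observed = ~np.isnan(x) & ~np.isnan(c)
    diff = x[observed] - c[observed]
    if metric == "sqeuclidean":
        return float(np.sum(diff ** 2))
    if metric == "cityblock":
        return float(np.sum(np.abs(diff)))
    raise ValueError(f"unknown metric: {metric}")


def prototype(points, metric="sqeuclidean"):
    """Cluster prototype computed per feature, ignoring NaNs.

    Assumption: mean for squared Euclidean, median for city block, and
    every feature has at least one observed value in the cluster.
    """
    if metric == "sqeuclidean":
        return np.nanmean(points, axis=0)
    return np.nanmedian(points, axis=0)


def ray_turi_style_index(X, labels, metric="sqeuclidean"):
    """Mean within-cluster distance over the minimum between-prototype
    distance. Lower values suggest more compact, better separated clusters.
    Requires at least two clusters.
    """
    ks = np.unique(labels)
    centers = {k: prototype(X[labels == k], metric) for k in ks}
    intra = np.mean(
        [partial_distance(x, centers[l], metric) for x, l in zip(X, labels)]
    )
    inter = min(
        partial_distance(centers[a], centers[b], metric)
        for i, a in enumerate(ks)
        for b in ks[i + 1:]
    )
    return float(intra / inter)


# Toy data with two missing entries (illustrative, not from the paper).
X = np.array([
    [0.0, 0.0], [0.0, np.nan], [1.0, 0.0],
    [10.0, 10.0], [np.nan, 10.0], [11.0, 10.0],
])
good = np.array([0, 0, 0, 1, 1, 1])
bad = np.array([0, 1, 0, 1, 0, 1])

for metric in ("sqeuclidean", "cityblock"):
    print(metric,
          ray_turi_style_index(X, good, metric),
          ray_turi_style_index(X, bad, metric))
```

With this toy data, the labeling that follows the two visible groups yields a lower index value than the interleaved labeling under both distances. In a comparison like the one the abstract describes, such an index would be evaluated over a range of cluster counts and missing-value ratios.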
Main Authors
Niemelä, M.; Äyrämö, S.; Kärkkäinen, T.
Format
Conference paper
Published
2018
Publisher
ESANN
Original source
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-16.pdf
The permanent address of the publication
https://urn.fi/URN:NBN:fi:jyu-201901281318
Use this for linking.
Parent publication ISBN
978-2-87587-047-6
Review status
Peer reviewed
Conference
European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Language
English
Is part of publication
ESANN 2018 : Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Citation
  • Niemelä, M., Äyrämö, S., & Kärkkäinen, T. (2018). Comparison of cluster validation indices with missing data. In ESANN 2018 : Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (pp. 461-466). ESANN. https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-16.pdf
License
In Copyright; Open Access
Funder(s)
Academy of Finland
Funding program(s)
Academy Programme, AoF
Research profiles, AoF
Additional information about funding
The work has been supported by the Academy of Finland through the projects 311737 (DysGeBra), 311877 (Demo), and 315550 (HNP-AI).
Copyright © Authors, 2018
