Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods

Niemelä, Marko; Kärkkäinen, Tommi

doi:10.1007/978-3-030-70787-3_9

Accepted version

Niemelä, M., & Kärkkäinen, T. (2022). Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods. In T. T. Tuovinen, J. Periaux, & P. Neittaanmäki (Eds.), Computational Sciences and Artificial Intelligence in Industry : New Digital Technologies for Solving Future Societal and Economical Challenges (pp. 123-133). Springer. Intelligent Systems, Control and Automation: Science and Engineering, 76. https://doi.org/10.1007/978-3-030-70787-3_9

Date

2022

Copyright

Abstract

Missing data poses a challenge in unsupervised learning. In clustering, when the form and the number of clusters are to be determined, one needs to deal with missing values both in the clustering process and in cluster validation. In previous research, missing values in the clustering algorithm have been handled with robust clustering methods and the available data strategy, and the cluster validation indices have been computed with the partial distance approximation. Recently, however, special methods for distance estimation with missing values have been proposed, and this work is the first in which these methods are systematically applied and tested in clustering and cluster validation. More precisely, we propose, implement, and analyze the use of distance estimation methods to improve the discrimination power of clustering and cluster validation indices. A novel, robust, prototype-based clustering process in two stages is suggested. Our results and conclusions confirm the usefu ...
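For readers unfamiliar with the partial distance approximation mentioned in the abstract, the following is a minimal sketch of the commonly used textbook form, not the authors' implementation. It assumes missing values are encoded as NaN and that the squared Euclidean distance over the jointly observed coordinates is rescaled by the ratio of total to observed dimensions. The function name `partial_distance_sq` is hypothetical.

```python
import math


def partial_distance_sq(x, y):
    """Squared Euclidean distance over jointly observed coordinates,
    rescaled by (total dimensions / jointly observed dimensions).

    Assumption for this sketch: missing values are encoded as NaN.
    Returns NaN when the two vectors share no observed coordinate.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    # Keep only the coordinate pairs where both values are observed.
    observed = [
        (a, b) for a, b in zip(x, y)
        if not (math.isnan(a) or math.isnan(b))
    ]
    if not observed:
        return math.nan

    scale = len(x) / len(observed)
    return scale * sum((a - b) ** 2 for a, b in observed)


nan = math.nan
d_full = partial_distance_sq([0.0, 0.0], [3.0, 4.0])
d_partial = partial_distance_sq([1.0, nan, 3.0], [2.0, 5.0, nan])
d_none = partial_distance_sq([nan, 1.0], [1.0, nan])
```

With complete vectors the function reduces to the ordinary squared Euclidean distance. With missing entries, the rescaling compensates for the dropped coordinates under the implicit assumption that they would contribute about as much as the observed ones; distance estimation methods of the kind the abstract refers to aim to improve on this approximation.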

Publisher

Springer

Parent publication ISBN

978-3-030-70786-6

Part of publication

Computational Sciences and Artificial Intelligence in Industry : New Digital Technologies for Solving Future Societal and Economical Challenges

ISSN

2213-8986

Keywords

machine learning; algorithms; cluster analysis

DOI

https://doi.org/10.1007/978-3-030-70787-3_9

URI

http://urn.fi/URN:NBN:fi:jyu-202212205765

Publication in research information system

https://converis.jyu.fi/converis/portal/detail/Publication/100292105

Collections

Faculty of Information Technology [2330]

Funder(s)

Academy of Finland

Funding program(s)

Academy Programme, AoF; Research profiles, AoF

Additional information about funding

The authors would like to thank the Academy of Finland for the financial support (grants 311877 and 315550).

License