Scalable robust clustering method for large and sparse data
Hämäläinen, J., Kärkkäinen, T., & Rossi, T. (2018). Scalable robust clustering method for large and sparse data. In ESANN 2018 : Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (pp. 449-454). ESANN. https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-134.pdf
Date
2018
Copyright
© Authors, 2018
Datasets for unsupervised clustering can be large and sparse, with a significant portion of missing values. We present here a scalable version of a robust clustering method with the available data strategy. More precisely, a general algorithm is described, and the accuracy and scalability of a distributed implementation of the algorithm are tested. The obtained results allow us to conclude the viability of the proposed approach.
Publisher
ESANN
Parent publication ISBN
978-2-87587-047-6
Conference
European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Is part of publication
ESANN 2018 : Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
Original source
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-134.pdf
Publication in research information system
https://converis.jyu.fi/converis/portal/detail/Publication/28889218
Related funder(s)
Academy of Finland
Funding program(s)
Research profiles, AoF; Academy Programme, AoF
Additional information about funding
The work of TK has been supported by the Academy of Finland from the projects 311877 (Demo) and 315550 (HNP-AI).
Related items
Showing items with similar title or keywords.
- Improvements and applications of the elements of prototype-based clustering
  Hämäläinen, Joonas (Jyväskylän yliopisto, 2018)
- Improving Clustering and Cluster Validation with Missing Data Using Distance Estimation Methods
  Niemelä, Marko; Kärkkäinen, Tommi (Springer, 2022)
  Missing data introduces a challenge in the field of unsupervised learning. In clustering, when the form and the number of clusters are to be determined, one needs to deal with the missing values both in the clustering ...
- Clustering Incomplete Spectral Data with Robust Methods
  Äyrämö, Sami; Pölönen, Ilkka; Eskelinen, Matti (International Society for Photogrammetry and Remote Sensing, 2017)
  Missing value imputation is a common approach for preprocessing incomplete data sets. In case of data clustering, imputation methods may cause unexpected bias because they may change the underlying structure of the data. ...
- Improving Scalable K-Means++
  Hämäläinen, Joonas; Kärkkäinen, Tommi; Rossi, Tuomo (MDPI AG, 2021)
  Two new initialization methods for K-means clustering are proposed. Both proposals are based on applying a divide-and-conquer approach for the K-means‖ type of an initialization strategy. The second proposal also uses ...
- Towards Evidence-Based Academic Advising Using Learning Analytics
  Gavriushenko, Mariia; Saarela, Mirka; Kärkkäinen, Tommi (Springer, 2018)
  Academic advising is a process between the advisee, adviser and the academic institution which provides the degree requirements and courses contained in it. Content-wise planning and management of the student’ study ...