Robust Principal Component Analysis of Data with Missing Values
Kärkkäinen, T., & Saarela, M. (2015). Robust Principal Component Analysis of Data with Missing Values. In P. Perner (Ed.), Machine Learning and Data Mining in Pattern Recognition : Proceedings of the 11th International Conference, MLDM 2015, Hamburg, Germany, July 20-21, 2015 (pp. 140-154). Lecture Notes in Computer Science (9166). Springer International Publishing. doi:10.1007/978-3-319-21024-7_10
Published in: Lecture Notes in Computer Science; 9166
© Springer International Publishing Switzerland 2015. This is a final draft version of an article whose final and definitive form has been published by Springer. Published in this repository with the kind permission of the publisher.
Principal component analysis is one of the most popular machine learning and data mining techniques. Although it originated in statistics, it is now used in numerous applications. However, there appears to have been little systematic testing and assessment of principal component analysis on erroneous and incomplete data. This article proposes multiple robust approaches for carrying out principal component analysis and, in particular, for estimating the relative importance of the principal components in explaining the data variability. The computational experiments first focus on carefully designed simulated tests in which the ground truth is known and can be used to assess the accuracy of the different methods. In addition, the methods are applied to and evaluated on an educational data set.
Publisher: Springer International Publishing
Is part of publication: Machine Learning and Data Mining in Pattern Recognition : Proceedings of the 11th International Conference, MLDM 2015, Hamburg, Germany, July 20-21, 2015. Edited by P. Perner. Lecture Notes in Computer Science (9166). Springer International Publishing. ISBN 978-3-319-21024-7