Feature Ranking of Large, Robust, and Weighted Clustering Result

Saarela, Mirka; Hämäläinen, Joonas; Kärkkäinen, Tommi

doi:10.1007/978-3-319-57454-7_8

Final Draft

Saarela, M., Hämäläinen, J., & Kärkkäinen, T. (2017). Feature Ranking of Large, Robust, and Weighted Clustering Result. In K. Jinho, S. Kyuseok, C. Longbing, L. Jae-Gil, L. Xuemin, & M. Yang-Sae (Eds.), Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings, Part I (pp. 96-109). Springer International Publishing. Lecture Notes in Computer Science, 10234. https://doi.org/10.1007/978-3-319-57454-7_8

Date

2017

Discipline

Tietotekniikka (Mathematical Information Technology)

Copyright

© Springer International Publishing AG 2017. This is a final draft version of an article whose final and definitive form has been published by Springer. Published in this repository with the kind permission of the publisher.

A clustering result needs to be interpreted and evaluated for knowledge discovery. When the clustered data is a sample from a population with known sample-to-population alignment weights, both the clustering and the evaluation techniques need to take these weights into account. The purpose of this article is to advance automatic knowledge discovery from a robust clustering result on the population level. To this end, we derive a novel ranking method by generalizing the computation of the Kruskal-Wallis H test statistic from the sample level to the population level with two different approaches. Applying these generalizations both to the input variables used in clustering and to metadata yields an automatic variable ranking that can be used to explain and distinguish the groups of the population. The ranking method is illustrated with an open dataset and then applied to advance educational knowledge discovery from large-scale international student assessment data, wh ...
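To make the idea in the abstract concrete, the following is a minimal Python sketch of ranking variables by a weighted Kruskal-Wallis H statistic across clusters. It is an illustration only: the abstract does not spell out the two generalizations derived in the article, so the weighting used here (a weight w treated as w identical copies of an observation, no tie correction) is an assumption, and the names `weighted_kruskal_wallis_h` and `rank_features` are hypothetical.

```python
from collections import defaultdict


def weighted_kruskal_wallis_h(values, groups, weights=None):
    """Kruskal-Wallis H (no tie correction) with optional observation weights.

    Assumption (not taken from the article): an observation with weight w
    counts like w identical copies of that observation.
    """
    if weights is None:
        weights = [1.0] * len(values)

    # Total weight carried by each distinct value.
    weight_at = defaultdict(float)
    for v, w in zip(values, weights):
        weight_at[v] += w

    # Weighted midrank of each distinct value:
    # weight strictly below it + (weight tied at it + 1) / 2.
    midrank = {}
    below = 0.0
    for v in sorted(weight_at):
        midrank[v] = below + (weight_at[v] + 1.0) / 2.0
        below += weight_at[v]
    total = below  # population size N = sum of all weights

    # Weighted rank sum and weighted size of each cluster.
    rank_sum = defaultdict(float)
    size = defaultdict(float)
    for v, g, w in zip(values, groups, weights):
        rank_sum[g] += w * midrank[v]
        size[g] += w

    spread = sum(rank_sum[g] ** 2 / size[g] for g in size)
    return 12.0 / (total * (total + 1.0)) * spread - 3.0 * (total + 1.0)


def rank_features(columns, groups, weights=None):
    """Return variable names sorted by decreasing H (most separating first)."""
    scores = {
        name: weighted_kruskal_wallis_h(col, groups, weights)
        for name, col in columns.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

With all weights equal to one, the function reduces to the classical H statistic without tie correction; with integer weights it gives the same value as computing H on the data with each observation repeated according to its weight. A larger H suggests that the variable separates the clusters more strongly, which is what the ranking orders by.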

Publisher

Springer International Publishing

Parent publication ISBN

978-3-319-57453-0

Conference

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Is part of publication

Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings, Part I

ISSN

0302-9743

Keywords

population analysis; Kruskal-Wallis test; robust clustering; educational knowledge discovery

DOI

https://doi.org/10.1007/978-3-319-57454-7_8

URI

http://urn.fi/URN:NBN:fi:jyu-201705022143

Publication in research information system

https://converis.jyu.fi/converis/portal/detail/Publication/26981996

Collections

Faculty of Information Technology [2330]