Feature Ranking of Large, Robust, and Weighted Clustering Result

Saarela, Mirka; Hämäläinen, Joonas; Kärkkäinen, Tommi

doi:10.1007/978-3-319-57454-7_8

dc.contributor.author	Saarela, Mirka
dc.contributor.author	Hämäläinen, Joonas
dc.contributor.author	Kärkkäinen, Tommi
dc.contributor.editor	Kim, Jinho
dc.contributor.editor	Shim, Kyuseok
dc.contributor.editor	Cao, Longbing
dc.contributor.editor	Lee, Jae-Gil
dc.contributor.editor	Lin, Xuemin
dc.contributor.editor	Moon, Yang-Sae
dc.date.accessioned	2017-05-11T10:15:43Z
dc.date.available	2017-05-11T10:15:43Z
dc.date.issued	2017
dc.identifier.citation	Saarela, M., Hämäläinen, J., & Kärkkäinen, T. (2017). Feature Ranking of Large, Robust, and Weighted Clustering Result. In J. Kim, K. Shim, L. Cao, J.-G. Lee, X. Lin, & Y.-S. Moon (Eds.), Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings, Part I (pp. 96-109). Springer International Publishing. Lecture Notes in Computer Science, 10234. https://doi.org/10.1007/978-3-319-57454-7_8
dc.identifier.other	CONVID_26981996
dc.identifier.other	TUTKAID_73663
dc.identifier.uri	https://jyx.jyu.fi/handle/123456789/53899
dc.description.abstract	A clustering result must be interpreted and evaluated before it yields knowledge. When the clustered data are a sample from a population with known sample-to-population alignment weights, both the clustering and the evaluation techniques must take these weights into account. The purpose of this article is to advance automatic knowledge discovery from a robust clustering result at the population level. To this end, we derive a novel ranking method by generalizing the computation of the Kruskal-Wallis H test statistic from the sample level to the population level in two different ways. Applying these generalizations both to the input variables used in clustering and to metadata yields an automatic variable ranking that can be used to explain and distinguish the population groups. The ranking method is illustrated with an open dataset and then applied to advance educational knowledge discovery from large-scale international student assessment data, whose robust clustering into disjoint groups on three different levels of abstraction was performed in [19].
dc.format.extent	778
dc.language.iso	eng
dc.publisher	Springer International Publishing
dc.relation.ispartof	Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings, Part I
dc.relation.ispartofseries	Lecture Notes in Computer Science
dc.subject.other	population analysis
dc.subject.other	Kruskal-Wallis test
dc.subject.other	robust clustering
dc.subject.other	educational knowledge discovery
dc.title	Feature Ranking of Large, Robust, and Weighted Clustering Result
dc.type	conferenceObject
dc.identifier.urn	URN:NBN:fi:jyu-201705022143
dc.contributor.laitos	Informaatioteknologian tiedekunta	fi
dc.contributor.laitos	Faculty of Information Technology	en
dc.contributor.oppiaine	Tietotekniikka	fi
dc.contributor.oppiaine	Mathematical Information Technology	en
dc.type.uri	http://purl.org/eprint/type/ConferencePaper
dc.date.updated	2017-05-02T12:15:06Z
dc.relation.isbn	978-3-319-57453-0
dc.type.coar	http://purl.org/coar/resource_type/c_5794
dc.description.reviewstatus	peerReviewed
dc.format.pagerange	96-109
dc.relation.issn	0302-9743
dc.type.version	acceptedVersion
dc.rights.copyright	© Springer International Publishing AG 2017. This is a final draft version of an article whose final and definitive form has been published by Springer. Published in this repository with the kind permission of the publisher.
dc.rights.accesslevel	openAccess	fi
dc.relation.conference	Pacific-Asia Conference on Knowledge Discovery and Data Mining
dc.relation.doi	10.1007/978-3-319-57454-7_8
dc.type.okm	A4
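
The abstract builds on the Kruskal-Wallis H test statistic as a way to rank variables by how strongly they separate clusters. As a point of reference only, the following is a minimal sketch of the classical, unweighted, sample-level H statistic and a ranking built from it. It does not reproduce the paper's two population-level generalizations, which this record does not describe. The function names (`average_ranks`, `kruskal_wallis_h`, `rank_features`) and the toy data are hypothetical, and the tie correction is omitted for brevity.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(values, labels):
    """Classical H statistic (no tie correction) of one variable across clusters."""
    n = len(values)
    ranks = average_ranks(values)
    rank_sums = defaultdict(float)
    sizes = defaultdict(int)
    for rank, label in zip(ranks, labels):
        rank_sums[label] += rank
        sizes[label] += 1
    total = sum(rank_sums[g] ** 2 / sizes[g] for g in sizes)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def rank_features(columns, labels):
    """Sort (name, H) pairs by decreasing H: larger H means stronger separation."""
    scores = {name: kruskal_wallis_h(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: six observations in two clusters.
    labels = [0, 0, 0, 1, 1, 1]
    columns = {
        "mixed": [1, 6, 3, 4, 5, 2],
        "separating": [1, 2, 3, 4, 5, 6],
    }
    for name, h in rank_features(columns, labels):
        print(f"{name}: H = {h:.3f}")
```

A weighted variant would presumably replace the per-observation counts with the alignment weights, but how the paper does so is an assumption left open here.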

Files in this item

Name: saahamkarpakkd2017.pdf
Size: 1.001Mb
Format: PDF
Description: Final Draft

This item appears in the following Collection(s)

Informaatioteknologian tiedekunta [2144]

Related items

Application of a Knowledge Discovery Process to Study Instances of Capacitated Vehicle Routing Problems

Feature extraction for supervised learning in knowledge discovery systems

Patient satisfaction : results of cluster analysis of finnish patients

Comparison of Internal Clustering Validation Indices for Prototype-Based Clustering

Clustering Incomplete Spectral Data with Robust Methods