dc.contributor.author: Niemelä, Marko
dc.contributor.author: Äyrämö, Sami
dc.contributor.author: Kärkkäinen, Tommi
dc.date.accessioned: 2022-02-01T10:27:23Z
dc.date.available: 2022-02-01T10:27:23Z
dc.date.issued: 2022
dc.identifier.citation: Niemelä, M., Äyrämö, S., & Kärkkäinen, T. (2022). Toolbox for Distance Estimation and Cluster Validation on Data With Missing Values. IEEE Access, 10, 352-367. https://doi.org/10.1109/ACCESS.2021.3136435
dc.identifier.other: CONVID_104072467
dc.identifier.uri: https://jyx.jyu.fi/handle/123456789/79601
dc.description.abstract: Missing data are unavoidable in the real-world application of unsupervised machine learning, and their nonoptimal processing may decrease the quality of data-driven models. Imputation is a common remedy for missing values, but methods that directly estimate expected distances have also emerged. Because the treatment of missing values is rarely considered in clustering-related tasks and distance metrics have a central role in both clustering and cluster validation, we developed a new toolbox that provides a wide range of algorithms for data preprocessing, distance estimation, clustering, and cluster validation in the presence of missing values. All these are core elements in any comprehensive cluster analysis methodology. We describe the methodological background of the implemented algorithms and present multiple illustrations of their use. The experiments include validating distance estimation methods against selected reference methods and demonstrating the performance of internal cluster validation indices. The experimental results demonstrate the general usability of the toolbox for the straightforward realization of alternate data processing pipelines. Source code, data sets, results, and example macros are available on GitHub: https://github.com/markoniem/nanclustering_toolbox [en]
dc.format.mimetype: application/pdf
dc.language.iso: eng
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartofseries: IEEE Access
dc.rights: CC BY 4.0
dc.subject.other: missing values
dc.subject.other: distance estimation
dc.subject.other: clustering
dc.subject.other: cluster validation
dc.title: Toolbox for Distance Estimation and Cluster Validation on Data With Missing Values
dc.type: article
dc.identifier.urn: URN:NBN:fi:jyu-202202011363
dc.contributor.laitos: Informaatioteknologian tiedekunta [fi]
dc.contributor.laitos: Faculty of Information Technology [en]
dc.contributor.oppiaine: Tekniikka [fi]
dc.contributor.oppiaine: Koulutusteknologia ja kognitiotiede [fi]
dc.contributor.oppiaine: Computing, Information Technology and Mathematics [fi]
dc.contributor.oppiaine: Human and Machine based Intelligence in Learning [fi]
dc.contributor.oppiaine: Laskennallinen tiede [fi]
dc.contributor.oppiaine: Engineering [en]
dc.contributor.oppiaine: Learning and Cognitive Sciences [en]
dc.contributor.oppiaine: Computing, Information Technology and Mathematics [en]
dc.contributor.oppiaine: Human and Machine based Intelligence in Learning [en]
dc.contributor.oppiaine: Computational Science [en]
dc.type.uri: http://purl.org/eprint/type/JournalArticle
dc.type.coar: http://purl.org/coar/resource_type/c_2df8fbb1
dc.description.reviewstatus: peerReviewed
dc.format.pagerange: 352-367
dc.relation.issn: 2169-3536
dc.relation.volume: 10
dc.type.version: publishedVersion
dc.rights.copyright: © Authors, 2022
dc.rights.accesslevel: openAccess [fi]
dc.relation.grantnumber: 315550
dc.relation.grantnumber: 311877
dc.subject.yso: algoritmit
dc.subject.yso: data
dc.subject.yso: validointi
dc.subject.yso: koneoppiminen
dc.subject.yso: klusterit
dc.subject.yso: tietojenkäsittely
dc.subject.yso: mallintaminen
dc.subject.yso: laatu
dc.format.content: fulltext
jyx.subject.uri: http://www.yso.fi/onto/yso/p14524
jyx.subject.uri: http://www.yso.fi/onto/yso/p27250
jyx.subject.uri: http://www.yso.fi/onto/yso/p20652
jyx.subject.uri: http://www.yso.fi/onto/yso/p21846
jyx.subject.uri: http://www.yso.fi/onto/yso/p18755
jyx.subject.uri: http://www.yso.fi/onto/yso/p2407
jyx.subject.uri: http://www.yso.fi/onto/yso/p3533
jyx.subject.uri: http://www.yso.fi/onto/yso/p5029
dc.rights.url: https://creativecommons.org/licenses/by/4.0/
dc.relation.dataset: https://github.com/markoniem/nanclustering_toolbox
dc.relation.doi: 10.1109/ACCESS.2021.3136435
dc.relation.funder: Research Council of Finland [en]
dc.relation.funder: Research Council of Finland [en]
dc.relation.funder: Suomen Akatemia [fi]
dc.relation.funder: Suomen Akatemia [fi]
jyx.fundingprogram: Academy Programme, AoF [en]
jyx.fundingprogram: Research profiles, AoF [en]
jyx.fundingprogram: Akatemiaohjelma, SA [fi]
jyx.fundingprogram: Profilointi, SA [fi]
jyx.fundinginformation: Academy of Finland under Grant 311877 (Demo) and Grant 315550.
dc.type.okm: A1


Unless otherwise noted, this item is licensed under CC BY 4.0.