University of Jyväskylä | JYX Digital Repository


Improvements and applications of the elements of prototype-based clustering

Published in
JYU dissertations
Authors
Hämäläinen, Joonas
Date
2018
Discipline
Information Technology (Tietotekniikka)

 
Clustering, or cluster analysis, is an essential part of data mining, machine learning, and pattern recognition. The most widely applied clustering methods are partitioning-based, or prototype-based, methods. They are usually easy to implement and scale well, and methods such as K-means clustering have been used in applications across many fields. On the other hand, prototype-based clustering methods are typically sensitive to initialization, and selecting the number of clusters for knowledge discovery purposes is not straightforward. In the era of big data, datasets are high-velocity and ever-growing, and they can also be erroneous, outlier-intensive, and sparse; this has motivated research on efficient prototype-based clustering methods for such challenging datasets. This collection of articles focuses primarily on making prototype-based clustering more scalable, efficient, and reliable for data processing. To achieve these goals, improvements and modifications to prototype-based clustering are presented in the six included articles. An application of prototype-based clustering to supervised learning in regression problems is also covered. In general, these efforts advance the knowledge discovery process towards more reliable data processing and big data. ...
 
Clustering, or cluster analysis, is a central area of data mining, machine learning, and pattern recognition. Partitioning, or prototype-based, clustering methods are the ones most used in applications. Prototype-based clustering methods are often easy to implement and scale well. These methods, such as K-means clustering, have been used in many applications in different fields of research. On the other hand, prototype-based clustering methods are sensitive to initialization, and choosing the number of clusters is not straightforward. In the era of big data, rapidly growing masses of data, which can also be erroneous, anomaly-intensive, and sparse, steer research towards developing efficient prototype-based clustering methods for challenging datasets. This article-based dissertation focuses mainly on making data processing with prototype-based clustering more scalable, efficient, and reliable. To achieve these goals, improvements and modifications to prototype-based clustering have been made in the six articles included in the dissertation. In addition, an application of prototype-based clustering to supervised learning in regression problems is addressed in one article. In general, the results of the dissertation advance the knowledge discovery process towards more reliable data processing and more scalable big data processing. ...
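
The abstract notes that prototype-based methods such as K-means are typically sensitive to initialization. The sketch below illustrates that point with textbook K-means (Lloyd's algorithm) on a tiny made-up dataset. It is not code from the dissertation or its articles; the function names `squared_distance` and `kmeans`, the data, and the two sets of initial prototypes are all assumptions chosen for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_prototypes, max_iter=100):
    """Plain Lloyd's K-means; returns (prototypes, sum of squared errors)."""
    prototypes = [tuple(p) for p in initial_prototypes]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest prototype.
        clusters = [[] for _ in prototypes]
        for p in points:
            nearest = min(
                range(len(prototypes)),
                key=lambda k: squared_distance(p, prototypes[k]),
            )
            clusters[nearest].append(p)
        # Update step: move each prototype to the mean of its members.
        # An empty cluster keeps its previous prototype.
        updated = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else prototypes[k]
            for k, members in enumerate(clusters)
        ]
        if updated == prototypes:  # no prototype moved: converged
            break
        prototypes = updated
    sse = sum(min(squared_distance(p, c) for c in prototypes) for p in points)
    return prototypes, sse


# Illustrative data (an assumption): three well-separated groups on a line.
points = [(x, 0) for x in (0, 1, 2, 10, 11, 12, 20, 21, 22)]

# One initial prototype per group versus all three inside the first group.
_, sse_good = kmeans(points, [(0, 0), (10, 0), (20, 0)])
_, sse_bad = kmeans(points, [(0, 0), (1, 0), (2, 0)])

print(sse_good, sse_bad)  # prints: 6.0 154.5
```

Both runs converge, but the second stops in a poor local optimum that merges two of the groups, which is why the starting prototypes matter for this family of methods.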
 
Publisher
Jyväskylän yliopisto
ISBN
978-951-39-7621-7
ISSN
2489-9003
Contains publications
  • Article I: Hämäläinen, J., & Kärkkäinen, T. (2016). Initialization of Big Data Clustering using Distributionally Balanced Folding. In ESANN 2016 : Proceedings of the 24th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (pp. 587-592). ESANN. https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2016-93.pdf
  • Article II: Hämäläinen, J., Kärkkäinen, T., & Rossi, T. (2018). Scalable initialization methods for clustering large datasets. Pattern Recognition Letters (in revision).
  • Article III: Hämäläinen, J., Kärkkäinen, T., & Rossi, T. (2018). Scalable robust clustering method for large and sparse data. In ESANN 2018 : Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (pp. 449-454). https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2018-134.pdf
  • Article IV: Hämäläinen, J., Jauhiainen, S., & Kärkkäinen, T. (2017). Comparison of Internal Clustering Validation Indices for Prototype-Based Clustering. Algorithms, 10 (3), 105. DOI: 10.3390/a10030105
  • Article V: Saarela, M., Hämäläinen, J., & Kärkkäinen, T. (2017). Feature Ranking of Large, Robust, and Weighted Clustering Result. In K. Jinho, S. Kyuseok, C. Longbing, L. Jae-Gil, L. Xuemin, & M. Yang-Sae (Eds.), Advances in Knowledge Discovery and Data Mining : 21st Pacific-Asia Conference, PAKDD 2017, Jeju, South Korea, May 23-26, 2017, Proceedings, Part I (pp. 96-109). Springer International Publishing. DOI: 10.1007/978-3-319-57454-7_8
  • Article VI: Hämäläinen, J., Kärkkäinen, T., & Gomes, J. P. P. (2018). Clustering-Based Reference Points Selection for the Minimal Learning Machine. Manuscript.
Keywords
knowledge discovery; data mining; machine learning; prototype-based clustering; big data; parallel computing; robust clustering; clustering initialization; K-means; minimal learning machine; random projection; tiedonlouhinta (data mining); koneoppiminen (machine learning); klusterianalyysi (cluster analysis); rinnakkaiskäsittely (parallel processing)
URI

http://urn.fi/URN:ISBN:978-951-39-7621-7

Collections
  • JYU Dissertations [127]
  • Doctoral dissertations (Väitöskirjat) [3032]

Related items

Showing items with similar title or keywords.

  • Application of a Knowledge Discovery Process to Study Instances of Capacitated Vehicle Routing Problems 

    Kärkkäinen, Tommi; Rasku, Jussi (Springer, 2020)
    Vehicle Routing Problems (VRP) are computationally challenging, constrained optimization problems, which have a central role in logistics management. Usually different solvers are being developed and applied for different ...
  • Intrusion detection applications using knowledge discovery and data mining 

    Juvonen, Antti (University of Jyväskylä, 2014)
  • Unstable feature relevance in classification tasks 

    Skrypnyk, Iryna (University of Jyväskylä, 2011)
  • On data mining applications in mobile networking and network security 

    Zolotukhin, Mikhail (University of Jyväskylä, 2014)
  • Intelligent solutions for real-life data-driven applications 

    Ivannikova, Elena (University of Jyväskylä, 2017)
    The subject of this thesis belongs to the topic of machine learning or, specifically, to the development of advanced methods for regression analysis, clustering, and anomaly detection. Industry is constantly seeking ...
Unless otherwise specified, publicly available JYX metadata (excluding abstracts) may be freely reused under the CC0 waiver.