Intelligent solutions for real-life data-driven applications

Abstract
This thesis falls within machine learning, specifically the development of advanced methods for regression analysis, clustering, and anomaly detection. Industry constantly seeks improved production practices and reduced production time and costs. In this connection, several industrial case studies are presented in which mathematical models for predicting paper quality are proposed. The most important variables for the prediction models are selected using information-theoretic measures and a regression-tree approach. The remaining original papers are devoted to unsupervised machine learning, with the main focus on developing advanced spectral clustering techniques for community detection and anomaly detection. As part of these efforts, a number of enhancements to the dependence clustering algorithm are proposed: regularization for controlling cluster size, an ensemble version for improving model stability, handling of overlapping clusters, and adaptations for solving anomaly detection problems and handling big datasets. Another focus of the thesis is the development of anomaly detection algorithms for network security data; here, a probabilistic transition-based approach is proposed for detecting application-layer distributed denial-of-service attacks. The developed approaches are tested on real datasets and solve the given tasks efficiently and with high accuracy. They are shown to be applicable to variable selection, graph segmentation, and anomaly detection tasks in different applications.
Main Author
Format
Theses Doctoral thesis
Published
2017
Series
Subjects
ISBN
978-951-39-7279-0
Publisher
University of Jyväskylä
The permanent address of the publication
https://urn.fi/URN:ISBN:978-951-39-7279-0
ISSN
1456-5390
Language
English
Published in
Jyväskylä studies in computing
License
In Copyright, Open Access
