Intelligent solutions for real-life data-driven applications
Abstract
This thesis belongs to the field of machine learning and, more specifically,
concerns the development of advanced methods for regression analysis, clustering, and
anomaly detection. Industry is constantly seeking improved production practices and reduced production time and costs. In this context, several
industrial case studies are presented in which mathematical models for predicting paper quality are proposed. The most important variables for the prediction
models are selected using information-theoretic measures and a regression
tree approach.
The rest of the original papers are devoted to unsupervised machine learning. The main focus is developing advanced spectral clustering techniques for
community detection and anomaly detection. As part of these efforts, a number of enhancements to the dependence clustering algorithm are proposed. These enhancements include adding regularization to control the
size of clusters, extending the algorithm to an ensemble version to improve model stability, handling overlapping clusters, and adapting the algorithm to anomaly detection
problems and large datasets.
Another focus of the thesis is on developing anomaly detection algorithms
for network security data. In connection to this, a probabilistic transition-based
approach is proposed for detecting application-layer distributed denial-of-service
attacks.
The developed approaches are tested on real datasets and solve the given tasks efficiently and with high accuracy. They
are shown to be applicable to variable selection, graph segmentation, and
anomaly detection tasks in different applications.
Main Author
Format
Theses
Doctoral thesis
Published
2017
Series
Subjects
ISBN
978-951-39-7279-0
Publisher
University of Jyväskylä
The permanent address of the publication
https://urn.fi/URN:ISBN:978-951-39-7279-0
ISSN
1456-5390
Language
English
Published in
Jyväskylä studies in computing