Knowledge mining of unstructured information : application to cyber domain
Abstract
Information on cyber-related crimes, incidents, and conflicts is abundant in open online sources. However, processing large volumes and streams of data is challenging for analysts and experts, and calls for new methods and techniques. In this article, we present and implement a novel knowledge graph and knowledge mining framework for extracting relevant information from free-form text about incidents in the cyber domain. The computational framework includes a machine learning-based pipeline for generating graphs of organizations, countries, industries, products, and attackers using a non-technical cyber-ontology. The extracted knowledge graph is used to estimate the incidence of cyberattacks within a given graph configuration. We use publicly available collections of real cyber-incident reports to test the efficacy of our methods. The knowledge extraction is found to be sufficiently accurate, and the graph-based threat estimation shows a level of correlation with the actual records of attacks. In practical use, an analyst using the presented framework can infer additional information from the current cyber-landscape about the risk to various entities and how that risk propagates between industries and countries.
Main Authors
Takko, T.; Bhattacharya, K.; Lehto, M.; Jalasvirta, P.; Cederberg, A.; Kaski, K.
Format
Articles
Research article
Published
2023
Publisher
Nature Publishing Group
The permanent address of the publication
https://urn.fi/URN:NBN:fi:jyu-202302061640
Review status
Peer reviewed
ISSN
2045-2322
DOI
https://doi.org/10.1038/s41598-023-28796-6
Language
English
Published in
Scientific Reports
Citation
- Takko, T., Bhattacharya, K., Lehto, M., Jalasvirta, P., Cederberg, A., & Kaski, K. (2023). Knowledge mining of unstructured information : application to cyber domain. Scientific Reports, 13, Article 1714. https://doi.org/10.1038/s41598-023-28796-6
Additional information about funding
TT, KB, ML and KK acknowledge research project funding from Cyberwatch Finland. TT acknowledges funding from the Vilho, Yrjö and Kalle Väisälä Foundation of the Finnish Academy of Science and Letters.
Copyright © The Author(s) 2023