Disentangling homonyms- using artificial neural networks to separate the cream from the crop in large text corpora
Roll, U., Correia, R. and Berger-Tal, O. (2018). Disentangling homonyms- using artificial neural networks to separate the cream from the crop in large text corpora. 5th European Congress of Conservation Biology. doi: 10.17011/conference/eccb2018/107550
Date
2018
Copyright
© the Authors, 2018
Recent years have seen a great influx of scientific publications, as well as other sources of text corpora, that are used for conservation research. This surge holds much promise for advancing science, but it also presents new challenges. One major challenge in using this wealth of information is how to sort through it efficiently and retain only the relevant parts. Homonyms - terms that share spelling but differ in meaning - present a unique challenge in this respect, as they contain no inherent information that can aid in classifying them across narratives. This issue is relevant to a wide array of conservation culturomics studies, because homonyms add considerable noise to results, and that noise cannot be easily identified. In this work we constructed a semi-automated approach that can aid in classifying homonyms between narratives. We used a combination of automated content analysis and artificial neural networks to quickly and accurately sift through large corpora of academic texts and classify them into distinct topics. As an example, we explore the use of the word 'reintroduction' in academic texts. In the conservation context, reintroduction indicates the release of organisms into their former native habitat; however, an 'ISI' search using this word returns thousands of publications that use the term with other meanings and in other contexts. Using our method, we were able to quickly classify thousands of academic texts as conservation-related or unrelated with more than 99% accuracy. Our approach can easily be applied to any other homonym and can greatly facilitate sorting data in cases where homonyms hinder the harnessing of large text corpora. Beyond homonyms, we see great promise in combining automated content analyses and machine learning methods to handle and screen big data for relevant information.
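The abstract does not specify the network architecture, the text features, or the software used, so the following is only a minimal sketch of the general idea: represent each text by the words surrounding the homonym and train a classifier to separate the conservation sense from other senses. It assumes binary bag-of-words features and a single logistic unit (the simplest possible neural network) in place of whatever the authors actually used. The training sentences and all names (`TRAIN`, `build_vocab`, `train`, `predict`) are invented for illustration.

```python
import math
import re

# Hypothetical toy corpus: 1 = conservation sense, 0 = any other sense.
TRAIN = [
    ("reintroduction of captive-bred lynx into native habitat", 1),
    ("survival of wolves after reintroduction to the reserve", 1),
    ("species reintroduction and habitat restoration outcomes", 1),
    ("reintroduction of the drug after treatment interruption", 0),
    ("gluten reintroduction in patients following elimination diet", 0),
    ("reintroduction of tariffs and trade policy effects", 0),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocab(texts):
    """Map every distinct token in the corpus to a feature index."""
    words = sorted({w for t in texts for w in tokenize(t)})
    return {w: i for i, w in enumerate(words)}


def vectorize(text, vocab):
    """Binary bag-of-words vector; tokens outside the vocabulary are ignored."""
    vec = [0.0] * len(vocab)
    for w in tokenize(text):
        if w in vocab:
            vec[vocab[w]] = 1.0
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, vocab, epochs=200, lr=0.5):
    """Fit one logistic unit with stochastic gradient descent on log loss."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    data = [(vectorize(t, vocab), y) for t, y in samples]
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(text, vocab, weights, bias):
    """Return the estimated probability that the text uses the conservation sense."""
    x = vectorize(text, vocab)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


VOCAB = build_vocab(t for t, _ in TRAIN)
WEIGHTS, BIAS = train(TRAIN, VOCAB)
```

Calling `predict("reintroduction of lynx to restored habitat", VOCAB, WEIGHTS, BIAS)` returns a probability; values above 0.5 would be read as the conservation sense. A realistic application would need a far larger labelled sample, held-out data for measuring accuracy, and most likely a network with hidden layers.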
Publisher
Open Science Centre, University of Jyväskylä
Conference
ECCB2018: 5th European Congress of Conservation Biology. 12th - 15th of June 2018, Jyväskylä, Finland
Original source
https://peerageofscience.org/conference/eccb2018/107550/
Collections
- ECCB 2018 [712]
Related items
Showing items with similar title or keywords.
- The Impact of Regularization on Convolutional Neural Networks
  Zeeshan, Khaula (2018) Deep learning has recently become the most popular machine learning method. The convolutional (neural) network is one of the most popular deep learning architectures for complex problems such as image ...
- Using deep neural networks for kinematic analysis : challenges and opportunities
  Cronin, Neil J. (Elsevier BV, 2021) Kinematic analysis is often performed in a lab using optical cameras combined with reflective markers. With the advent of artificial intelligence techniques such as deep neural networks, it is now possible to perform such ...
- Neural Mechanisms of Joint Action in Musical Ensembles : Disentangling Self and Other Integration
  Nijhuis, Patti; Bamford, Joshua S. (Mariani Foundation for Paediatric Neurology, 2024) Musical ensembles continuously anticipate and adapt to each other’s movements for optimal joint performance. Players must divide their attentional resources between their own actions and those of the ensemble. In improvisational ...
- On Attacking Future 5G Networks with Adversarial Examples : Survey
  Zolotukhin, Mikhail; Zhang, Di; Hämäläinen, Timo; Miraghaei, Parsa (MDPI AG, 2023) The introduction of 5G technology along with the exponential growth in connected devices is expected to cause a challenge for the efficient and reliable network resource allocation. Network providers are now required to ...