
dc.contributor.advisor: Costin, Andrei
dc.contributor.author: Lampinen, Kenneth
dc.date.accessioned: 2020-12-28T08:35:28Z
dc.date.available: 2020-12-28T08:35:28Z
dc.date.issued: 2020
dc.identifier.uri: https://jyx.jyu.fi/handle/123456789/73442
dc.description.abstract: The proliferation of IoT devices brings many cyber security challenges. Identifying executable code with known vulnerabilities is one of them, even though open source code is commonly used in IoT firmware. Factors that contribute to this challenge include the high usage of heterogeneous architectures, as well as non-standard toolsets and compilers when developing IoT firmware. To address this issue, this work examines the latest research in binary code matching. It concludes that the research does not adequately address the current cyber security issues incurred by IoT devices and proposes a new method of binary code matching based on techniques and methods commonly seen in Natural Language Processing (NLP). An artefact using Google’s BERT and a custom bi-directional LSTM Siamese network is developed and tested to demonstrate the viability of this new method. The BERT model was pre-trained using the code sections of binary executables compiled for the ARM architecture. It achieved scores of 89.1% and 98.0% in the key metrics of masked_lm_accuracy and next_sentence_accuracy respectively. This pre-trained BERT model was used to extract embeddings from the binary files’ code sections in order to train and validate the Siamese network. The Siamese network achieved an average accuracy of approximately 80% on the task of matching the stripped code sections of binary files compiled by two separate open source projects. This compares favorably to the 0% accuracy achieved by the fuzzy hashing algorithms SSDEEP and SDHASH. [en]
dc.format.extent: 99
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject.other: binary file matching
dc.subject.other: deep learning
dc.subject.other: Natural Language Processing
dc.subject.other: NLP
dc.subject.other: BERT
dc.subject.other: transformer
dc.subject.other: LSTM
dc.subject.other: Siamese network
dc.subject.other: similarity detection
dc.subject.other: SSDEEP
dc.subject.other: SDHASH
dc.title: Architecture-independent matching of stripped binary code files using BERT and a Siamese neural network
dc.identifier.urn: URN:NBN:fi:jyu-202012287374
dc.type.ontasot: Pro gradu -tutkielma [fi]
dc.type.ontasot: Master’s thesis [en]
dc.contributor.tiedekunta: Informaatioteknologian tiedekunta [fi]
dc.contributor.tiedekunta: Faculty of Information Technology [en]
dc.contributor.laitos: Informaatioteknologia [fi]
dc.contributor.laitos: Information Technology [en]
dc.contributor.yliopisto: Jyväskylän yliopisto [fi]
dc.contributor.yliopisto: University of Jyväskylä [en]
dc.contributor.oppiaine: Tietojenkäsittelytiede [fi]
dc.contributor.oppiaine: Computer Science [en]
dc.rights.copyright: Julkaisu on tekijänoikeussäännösten alainen. Teosta voi lukea ja tulostaa henkilökohtaista käyttöä varten. Käyttö kaupallisiin tarkoituksiin on kielletty. [fi]
dc.rights.copyright: This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited. [en]
dc.rights.accesslevel: restrictedAccess
dc.type.publication: masterThesis
dc.contributor.oppiainekoodi: 601
dc.subject.yso: kyberturvallisuus
dc.subject.yso: koneoppiminen
dc.subject.yso: esineiden internet
dc.subject.yso: cyber security
dc.subject.yso: machine learning
dc.subject.yso: Internet of things
dc.format.content: fulltext
dc.rights.accessrights: Tekijä ei ole antanut lupaa avoimeen julkaisuun, joten aineisto on luettavissa vain Jyväskylän yliopiston kirjaston arkistotyöasemalta (https://kirjasto.jyu.fi/fi/tyoskentelytilat/laitteet-ja-tilat). [fi]
dc.rights.accessrights: The author has not given permission to make the work publicly available electronically. The material can therefore be read only at the archival workstation (https://kirjasto.jyu.fi/en/workspaces/facilities) at Jyväskylä University Library, which is reserved for the use of archival materials. [en]
dc.type.okm: G2




