
dc.contributor.advisor: Khriyenko, Oleksiy
dc.contributor.author: Hajihashemi Varnousfaderani, Elahe
dc.date.accessioned: 2024-01-08T07:20:46Z
dc.date.available: 2024-01-08T07:20:46Z
dc.date.issued: 2023
dc.identifier.uri: https://jyx.jyu.fi/handle/123456789/92552
dc.description.abstract: Information retrieval systems such as search engines, originally designed to help users find information, have become more powerful and have found utility in a wider range of applications by incorporating contextual comprehension through language models. Selecting the language model best suited to a given task is a challenging multi-objective problem, as each model has a specific set of attributes that affect its performance. Accuracy, resource consumption, and time consumption are the most important objectives in assessing the quality of a search system. This research addresses these objectives by exploring the performance of two language models with differing characteristics in the development of a semantic search pipeline. The studied models are a distilled version of BERT fine-tuned on a specific task, and GPT-2 as a general pre-trained model with a very large number of parameters. The semantic search pipeline, which maps contents and queries into a common vector space using a large language model and retrieves the most relevant results, is implemented in this study as the experimental setup of the research. Using evaluation metrics to assess a model's performance requires ground truth data; this research therefore proposes several approaches for generating synthetic ground truth to tackle evaluation and fine-tuning challenges when labeled data is scarce. To pursue the research objectives, quantitative data is gathered in an experimental setting, and conclusions and recommendations are drawn from analysis of the experimental results. The results indicate that model size should not be the main criterion when selecting a language model for downstream tasks; the model architecture and fine-tuning on a task-specific dataset affect performance dramatically as well. As the results show, the smaller model fine-tuned for semantic textual similarity surpasses the larger general model. The experiment investigating the proposed annotation-generation approaches indicates that these methods are readily applicable to computing evaluation metrics and can be extended to fine-tuning. The results demonstrate that task-oriented transfer learning through distillation and fine-tuning can compensate for the learning capacity that a larger number of parameters instills in general models, although this finding should be examined in future research with respect to the values assigned to various variables here, e.g., the number of tokens used when splitting large texts into smaller chunks. Moreover, it would be worthwhile to also fine-tune the large general model in the future, so that the models can be compared under more comparable conditions. [en] (see the illustrative pipeline sketch after this record)
dc.format.extent: 157
dc.language.iso: eng
dc.rights: In Copyright
dc.subject.other: semantic search
dc.subject.other: large language models
dc.subject.other: generative models
dc.subject.other: fine-tuning
dc.subject.other: transfer learning
dc.title: Challenges and insights in semantic search using language models
dc.type: master thesis
dc.identifier.urn: URN:NBN:fi:jyu-202401081055
dc.type.ontasot: Master’s thesis [en]
dc.type.ontasot: Pro gradu -tutkielma [fi]
dc.contributor.tiedekunta: Faculty of Information Technology [en]
dc.contributor.tiedekunta: Informaatioteknologian tiedekunta [fi]
dc.contributor.laitos: Information Technology [en]
dc.contributor.laitos: Informaatioteknologia [fi]
dc.contributor.yliopisto: University of Jyväskylä [en]
dc.contributor.yliopisto: Jyväskylän yliopisto [fi]
dc.contributor.oppiaine: Mathematical Information Technology [en]
dc.contributor.oppiaine: Tietotekniikka [fi]
dc.type.coar: http://purl.org/coar/resource_type/c_bdcc
dc.rights.copyright: © The Author(s)
dc.rights.accesslevel: openAccess
dc.type.publication: masterThesis
dc.contributor.oppiainekoodi: 602
dc.subject.yso: luonnollisen kielen käsittely [fi]
dc.subject.yso: tiedonhaku [fi]
dc.subject.yso: mallintaminen [fi]
dc.subject.yso: tekoäly [fi]
dc.subject.yso: koneoppiminen [fi]
dc.subject.yso: natural language processing [en]
dc.subject.yso: information retrieval [en]
dc.subject.yso: modelling (representation) [en]
dc.subject.yso: artificial intelligence [en]
dc.subject.yso: machine learning [en]
dc.rights.url: https://rightsstatements.org/page/InC/1.0/
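
The abstract describes the core mechanism of the semantic search pipeline: contents and queries are embedded into a common vector space and results are ranked by similarity. Below is a minimal, hypothetical Python sketch of that idea; the sentence-transformers library and the all-MiniLM-L6-v2 model are illustrative assumptions, not the exact tooling or models evaluated in the thesis.

    # Hypothetical sketch: embed documents and a query into a shared
    # vector space, then rank the documents by cosine similarity.
    # Library and model name are assumptions for illustration only.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    documents = [
        "Distilled BERT models embed text for semantic similarity.",
        "GPT-2 is a general pre-trained generative language model.",
        "Long texts can be split into smaller chunks of tokens.",
    ]
    query = "Which model embeds text for semantic search?"

    # Map the contents and the query into the common vector space.
    doc_vecs = model.encode(documents, convert_to_tensor=True)
    query_vec = model.encode(query, convert_to_tensor=True)

    # Retrieve the most relevant result by cosine similarity.
    scores = util.cos_sim(query_vec, doc_vecs)[0]
    best = int(scores.argmax())
    print(f"Top result ({float(scores[best]):.3f}): {documents[best]}")

Under this formulation, comparing a small fine-tuned encoder against a larger general model such as GPT-2 amounts to swapping the embedding model while keeping the ranking step fixed, which is the kind of comparison the abstract reports.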


