dc.contributor.advisor | Veijalainen, Jari | |
dc.contributor.author | Akhavan Rahnama, Amir | |
dc.date.accessioned | 2015-02-18T08:55:59Z | |
dc.date.available | 2015-02-18T08:55:59Z | |
dc.date.issued | 2015 | |
dc.identifier.other | oai:jykdok.linneanet.fi:1466490 | |
dc.identifier.uri | https://jyx.jyu.fi/handle/123456789/45352 | |
dc.description.abstract | Sentiment analysis of the Twitter public stream has recently been a topic of research. Several non-commercial libraries and software packages have been developed to perform sentiment analysis; however, none of them performs the analysis in real time on Twitter data. Performing the same task in real time can give us insight into Twitter users’ public opinions regarding recent events at the time the analysis is made. In this thesis, we propose a full-stack architecture with a software prototype that performs real-time sentiment analysis on the Twitter public stream. We address the problem using large-scale online learning, specifically online parallel decision trees. Large-scale learning is used because social media websites such as Twitter produce data at high volume (around 5800 tweets per second in 2014) and, in addition, real-time analytics imposes a tight time constraint (up to seconds) on learning, processing, and query response time. Moreover, Twitter stream data arrives instance by instance, and therefore we use online learning, which offers incremental, per-instance learning. SAMOA is a framework that provides support for a set of scalable online learning algorithms such as the Vertical Hoeffding Tree (VHT). We use SAMOA’s VHT learner with Apache Storm as our stream processing engine. However, VHT and Apache Storm alone cannot solve the problem at hand. Therefore, we also developed an open-source Java library called Sentinel that completes our architecture by providing real-time Twitter stream reading, in-memory pre-processing computations and data structures, feature selection, frequent miner algorithms, and more. Chapter 3 presents the architecture of our solution, and Chapter 4 demonstrates its applicability and usefulness. | en |
dc.format.extent | 1 online resource (62 pages) | |
dc.format.mimetype | application/pdf | |
dc.language.iso | eng | |
dc.rights | In Copyright | en |
dc.subject.other | sentiment analysis | |
dc.subject.other | real-time analytics | |
dc.subject.other | social media mining | |
dc.subject.other | twitter | |
dc.subject.other | large-scale learning | |
dc.subject.other | parallel decision tree | |
dc.title | Real-time sentiment analysis of Twitter public stream | |
dc.type | master thesis | |
dc.identifier.urn | URN:NBN:fi:jyu-201502181337 | |
dc.type.ontasot | Pro gradu -tutkielma | fi |
dc.type.ontasot | Master’s thesis | en |
dc.contributor.tiedekunta | Informaatioteknologian tiedekunta | fi |
dc.contributor.tiedekunta | Faculty of Information Technology | en |
dc.contributor.laitos | Tietotekniikan laitos | fi |
dc.contributor.laitos | Department of Mathematical Information Technology | en |
dc.contributor.yliopisto | University of Jyväskylä | en |
dc.contributor.yliopisto | Jyväskylän yliopisto | fi |
dc.contributor.oppiaine | Tietotekniikka | fi |
dc.contributor.oppiaine | Mathematical Information Technology | en |
dc.date.updated | 2015-02-18T08:56:00Z | |
dc.type.coar | http://purl.org/coar/resource_type/c_bdcc | |
dc.type.publication | masterThesis | |
dc.contributor.oppiainekoodi | 602 | |
dc.subject.yso | Twitter | |
dc.subject.yso | social media | |
dc.subject.yso | data mining | |
dc.format.content | fulltext | |
dc.rights.url | https://rightsstatements.org/page/InC/1.0/ | |
dc.type.okm | G2 | |