Show simple item record

dc.contributor.author: Cochez, Michael
dc.date.accessioned: 2016-05-16T10:05:44Z
dc.date.available: 2016-05-16T10:05:44Z
dc.date.issued: 2016
dc.identifier.isbn: 978-951-39-6649-2
dc.identifier.other: oai:jykdok.linneanet.fi:1541145
dc.identifier.uri: https://jyx.jyu.fi/handle/123456789/49793
dc.description.abstract: Information and its derived knowledge are not static. Information changes over time, and our understanding of it evolves with our ability and willingness to consume it. Compared to humans, current computer systems seem very limited in their ability to really understand the meaning of things; on the other hand, they are very powerful at performing exact computations. One aspect that sets humans apart from machines when trying to understand the world is that we often make mistakes, forget information, or choose what to focus on. Put another way, humans can behave somewhat more randomly and still outperform machines in knowledge-related tasks. In computer science there is a branch of research concerned with allowing randomness or inaccuracy in algorithms, which are then called approximate algorithms. The main benefit of these algorithms is that they are often much faster than their exact counterparts, at the cost of occasionally producing wrong or inexact results. Hence, these algorithms can be used in contexts where erring once in a while does no harm. If the chance of making a mistake is very slim, say lower than the chance of a memory error, then the expected precision will rival that of the exact counterparts. Furthermore, the input data to the algorithms often already contains a fair amount of uncertainty, so the small error which the approximate algorithm introduces becomes more or less insignificant. In this dissertation, the author investigates the application of familiar and new approximate algorithms to knowledge discovery and evolution.
The main contributions of the dissertation are a) an abstract formulation of what it means for an ontology to be and stay optimal over time, b) a contribution to a vision paper regarding the future of evolving knowledge ecosystems, c) an investigation of the application of locality-sensitive hashing (LSH) in the context of ontology matching and semantic search, d) the twister tries algorithm, a novel approximate hierarchical clustering approach with linear space and time requirements, and e) an extension of the twister tries algorithm which trades a longer, but adaptable, running time for a likely improvement of the clustering result.
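The abstract mentions locality-sensitive hashing (LSH) as a tool for ontology matching and semantic search. As an illustrative sketch only (not code from the dissertation, and with hypothetical function names), the MinHash variant of LSH estimates the Jaccard similarity of two token sets from short fixed-size signatures, trading exactness for speed in exactly the way the abstract describes:

```python
import random
import zlib

def minhash_signature(tokens, num_hashes=64, seed=42):
    """Build a MinHash signature: for each hash function in the family,
    keep the minimum hash value seen over all tokens in the set."""
    rng = random.Random(seed)
    p = (1 << 61) - 1  # large Mersenne prime for the hash family
    # Each (a, b) pair defines one hash function h(x) = (a*x + b) mod p.
    params = [(rng.randrange(1, p), rng.randrange(p)) for _ in range(num_hashes)]
    # crc32 gives a deterministic base hash for each token string.
    hashed = [zlib.crc32(t.encode("utf-8")) for t in tokens]
    return [min((a * h + b) % p for h in hashed) for a, b in params]

def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions is an unbiased
    estimate of the Jaccard similarity of the underlying sets."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)
```

Two sets sharing most tokens agree on most signature positions, so near-duplicate items can be bucketed by signature instead of compared pairwise — the estimate is occasionally off, but the comparison cost no longer depends on the size of the original sets.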
dc.format.extent: 1 online resource (56 pages, 85 pages in multiple numbering sequences)
dc.language.iso: eng
dc.publisher: University of Jyväskylä
dc.relation.ispartofseries: Jyväskylä studies in computing
dc.subject.other: knowledge evolution
dc.subject.other: hierarchical clustering
dc.subject.other: information retrieval
dc.title: Taming big knowledge evolution
dc.type: Diss.
dc.identifier.urn: URN:ISBN:978-951-39-6649-2
dc.type.dcmitype: Text (en)
dc.type.ontasot: Väitöskirja (fi)
dc.type.ontasot: Doctoral dissertation (en)
dc.contributor.tiedekunta: Informaatioteknologian tiedekunta (fi; Faculty of Information Technology)
dc.contributor.yliopisto: University of Jyväskylä (en)
dc.contributor.yliopisto: Jyväskylän yliopisto (fi)
dc.contributor.oppiaine: Tietotekniikka (fi; Information Technology)
dc.relation.issn: 1456-5390
dc.relation.numberinseries: 237
dc.rights.accesslevel: openAccess (fi)
dc.subject.yso: tiedonlouhinta (data mining)
dc.subject.yso: big data
dc.subject.yso: tiedonhaku (information retrieval)
dc.subject.yso: tiedonhakujärjestelmät (information retrieval systems)
dc.subject.yso: ontologiat (ontologies)
dc.subject.yso: semanttinen web (Semantic Web)
dc.subject.yso: algoritmit (algorithms)
dc.subject.yso: optimointi (optimization)
dc.subject.yso: matemaattinen optimointi (mathematical optimization)
dc.subject.yso: geneettiset algoritmit (genetic algorithms)
dc.subject.yso: klusterianalyysi (cluster analysis)

