Linking Textual Resources to Support Information Discovery

Knoth, Petr (2015). Linking Textual Resources to Support Information Discovery. PhD thesis, The Open University.


A vast amount of information is now stored as textual documents, many of which are available online. These documents come from different sources and are of different types, including newspaper articles, books, corporate reports, encyclopedia entries and research papers. At a semantic level, these documents contain knowledge that was created by explicitly connecting pieces of information and expressing them in natural language. However, a significant amount of knowledge is not explicitly stated in any single document, yet it can be derived or discovered by researching, i.e. accessing, comparing, contrasting and analysing information from multiple documents. Carrying out this work with traditional search interfaces is tedious, both because of information overload and because it is difficult to formulate queries for information we are not yet aware of.

To support this exploratory process, we need to be able to navigate effectively between related pieces of information across documents. While information can be connected using manually curated cross-document links, this approach does not scale and cannot systematically assist in discovering non-obvious (hidden) relationships. Consequently, there is a need for automatic approaches to link discovery.

This work studies how people link content, investigates the properties of different link types, presents new methods for automatic link discovery, and designs a system in which link discovery is applied to a collection of millions of documents to improve access to public knowledge.

Item Type: Thesis (PhD)
Copyright Holders: 2015 Petr Knoth
Keywords: online bibliographic searching; information retrieval; information storage; Internet searching; Semantic Web; semantic computing; CORE
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Research Group: Centre for Research in Computing (CRC); Big Scientific Data and Text Analytics Group (BSDTAG)
Item ID: 42677
Depositing User: Petr Knoth
Date Deposited: 26 May 2015 13:59
Last Modified: 03 Apr 2019 02:36
