The Open University

Corpus profiling with Nootropia

Nanas, Nikolaos and De Roeck, Anne (2008). Corpus profiling with Nootropia. In: BCS-IRSG Workshop on Corpus Profiling, 18 Oct 2008, London.

Full text available as:
Full text not publicly available
Due to copyright restrictions, this file is not available for public download


The characteristics of different corpora influence the success of Information Retrieval and NLP methods. How best to characterise a corpus is still an unexplored research area. In this paper, we use a model that has so far been applied to user profiling in Information Filtering to profile the corpora of the TIPSTER collection. Each corpus profile is a network of terms from which a series of statistical features can be extracted. These features can be used to calculate the similarity between the corpora in TIPSTER. This is part of ongoing work that aims to provide a corpus profiling service that will map corpora to their features and to the corresponding experimental results of various models and techniques.
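
The general idea of profiling a corpus as a term network, extracting statistical features from it, and comparing corpora by those features can be illustrated with a minimal sketch. This is not the Nootropia model itself, whose details are not given in the abstract. The co-occurrence window, the chosen features (node count, edge count, mean degree, density) and the function names `build_term_network`, `network_features` and `cosine_similarity` are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_term_network(documents, window=3):
    """Build a toy co-occurrence network (assumed design, not Nootropia).

    Nodes are terms weighted by frequency; an undirected edge links two
    distinct terms that occur within `window` tokens of each other.
    """
    term_weights = Counter()
    edge_weights = Counter()
    for doc in documents:
        tokens = doc.lower().split()
        term_weights.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1:i + window]:
                if left != right:
                    edge_weights[tuple(sorted((left, right)))] += 1
    return term_weights, edge_weights


def network_features(term_weights, edge_weights):
    """Return [nodes, edges, mean degree, density] for a term network."""
    n = len(term_weights)
    m = len(edge_weights)
    max_edges = n * (n - 1) / 2
    mean_degree = 2 * m / n if n else 0.0
    density = m / max_edges if max_edges else 0.0
    return [float(n), float(m), mean_degree, density]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical miniature "corpora", used only to exercise the functions.
corpus_a = ["the market fell sharply", "the market rose again"]
corpus_b = ["the patent covers a new device", "a device for signal processing"]

features_a = network_features(*build_term_network(corpus_a))
features_b = network_features(*build_term_network(corpus_b))
similarity = cosine_similarity(features_a, features_b)
```

In practice the raw features would likely need normalising before comparison, since counts such as the number of nodes dominate the cosine when left on their original scale.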

Item Type: Conference Item
Copyright Holders: 2008 Not known
Keywords: corpus profiling, Nootropia
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Mathematics, Computing and Technology
Interdisciplinary Research Centre: Centre for Research in Computing (CRC)
Item ID: 27984
Depositing User: Catherine McNulty
Date Deposited: 08 Feb 2011 10:10
Last Modified: 09 Feb 2011 17:13

© The Open University