The Open University

Corpus profiling with Nootropia

Nanas, Nikolaos and De Roeck, Anne (2008). Corpus profiling with Nootropia. In: BCS-IRSG Workshop on Corpus Profiling, 18 Oct 2008, London.

Full text: not publicly available (Version of Record). Due to publisher licensing restrictions, this file is not available for public download.

The characteristics of a corpus influence how well Information Retrieval and NLP methods perform on it. How best to characterise a corpus remains an unexplored research area. In this paper, we take a model that has so far been applied to user profiling in Information Filtering and use it to profile the corpora of the TIPSTER collection. Each corpus profile is a network of terms from which a series of statistical features can be extracted. These features can be used to calculate the similarity between the corpora in TIPSTER. This is part of ongoing work towards a corpus profiling service that will map corpora to their features and to the corresponding experimental results of various models and techniques.
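
The general idea in the abstract (a term network per corpus, statistical features read off the network, similarity computed between feature sets) can be illustrated with a minimal Python sketch. This is not the Nootropia model, whose details are not given in this record. The co-occurrence window, the three features, and the cosine measure are all assumptions chosen for illustration, and the function names `build_term_network`, `profile_features` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def build_term_network(documents, window=3):
    """Build a toy term network from tokenised documents.

    Nodes are terms weighted by frequency; links join terms that
    co-occur within a sliding window (an assumed linking rule).
    """
    term_weights = Counter()
    link_weights = Counter()
    for tokens in documents:
        term_weights.update(tokens)
        for i in range(len(tokens)):
            span = tokens[i:i + window]
            head = span[0]
            for other in span[1:]:
                if other != head:
                    # Undirected link: store the pair in sorted order.
                    link_weights[tuple(sorted((head, other)))] += 1
    return term_weights, link_weights


def profile_features(term_weights, link_weights):
    """Extract a few illustrative statistics from the network."""
    n_terms = len(term_weights)
    n_links = len(link_weights)
    total_tokens = sum(term_weights.values())
    possible_links = n_terms * (n_terms - 1) / 2
    return {
        "type_token_ratio": n_terms / total_tokens if total_tokens else 0.0,
        "link_density": n_links / possible_links if possible_links else 0.0,
        "mean_degree": 2 * n_links / n_terms if n_terms else 0.0,
    }


def cosine_similarity(features_a, features_b):
    """Cosine similarity between two feature dictionaries."""
    keys = set(features_a) | set(features_b)
    dot = sum(features_a.get(k, 0.0) * features_b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in features_a.values()))
    norm_b = math.sqrt(sum(v * v for v in features_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

To compare two corpora under these assumptions, build a network for each, pass each network to `profile_features`, and compare the two resulting dictionaries with `cosine_similarity`. In practice the features would need normalising to comparable scales before a cosine comparison is meaningful.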

Item Type: Conference or Workshop Item
Copyright Holders: 2008 Not known
Keywords: corpus profiling, Nootropia
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Computing and Communications
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Item ID: 27984
Depositing User: Catherine McNulty
Date Deposited: 08 Feb 2011 10:10
Last Modified: 15 Dec 2018 09:41