
Adapting the Naive Bayes classifier to rank procedural texts

Yin, Ling and Power, Richard (2006). Adapting the Naive Bayes classifier to rank procedural texts. In: Lalmas, Mounia; MacFarlane, Andy; Rüger, Stefan M.; Tombros, Anastasios; Tsikrika, Theodora and Yavlinsky, Alexei eds. Advances in Information Retrieval. Lecture Notes in Computer Science, 3936. Berlin, Germany: Springer, pp. 179–190.

Full text: not publicly available
Due to copyright restrictions, this file is not available for public download.
URL: http://www.springerlink.com/content/x0h57n741gj97r...
DOI (Digital Object Identifier) Link: http://dx.doi.org/10.1007/11735106_17

Abstract

This paper presents a machine-learning approach for ranking web documents according to the proportion of procedural text they contain. By 'procedural text' we refer to ordered lists of steps, which are very common in some instructional genres such as online manuals. Our initial training corpus is built up by applying some simple heuristics to select documents from a large collection and contains only a few documents with a large proportion of procedural texts. We adapt the Naive Bayes classifier to better fit this less than ideal training corpus. This adapted model is compared with several other classifiers in ranking procedural texts using different sets of features and is shown to perform well when only highly distinctive features are used.
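
As a rough illustration of the general idea of ranking documents with a Naive Bayes model, the following is a minimal sketch of a standard, unadapted multinomial Naive Bayes ranker. It is not the adapted model from the paper, whose details are not given in this record. The function names, the toy tokens, the binary labels, and the add-one smoothing are all assumptions made for illustration only.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a plain multinomial Naive Bayes model with add-one smoothing.

    docs: list of token lists; labels: 1 = procedural, 0 = not (assumed encoding).
    """
    vocab = {tok for doc in docs for tok in doc}
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter(labels)
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for c in (0, 1):
        # Denominator includes the vocabulary size because of add-one smoothing.
        total = sum(counts[c].values()) + len(vocab)
        model["log_prior"][c] = math.log(n_docs[c] / len(docs))
        model["log_lik"][c] = {
            t: math.log((counts[c][t] + 1) / total) for t in vocab
        }
    return model


def procedural_score(model, doc):
    """Log-odds that doc is procedural; tokens unseen in training are ignored."""
    score = model["log_prior"][1] - model["log_prior"][0]
    for tok in doc:
        if tok in model["vocab"]:
            score += model["log_lik"][1][tok] - model["log_lik"][0][tok]
    return score


def rank(model, docs):
    """Return document indices ordered from most to least procedural."""
    return sorted(
        range(len(docs)),
        key=lambda i: procedural_score(model, docs[i]),
        reverse=True,
    )


# Hypothetical toy data: two procedural and two non-procedural training documents.
train_docs = [
    ["first", "click", "then", "press"],
    ["click", "next", "then", "select"],
    ["the", "history", "of", "computing"],
    ["we", "discuss", "the", "results"],
]
train_labels = [1, 1, 0, 0]
model = train_nb(train_docs, train_labels)

candidates = [
    ["the", "results", "of", "history"],
    ["click", "then", "select"],
]
ranking = rank(model, candidates)
print(ranking)
```

Sorting by log-odds rather than thresholding them gives a ranking instead of a hard classification, which matches the ranking task the abstract describes. How the paper adjusts the model for a training corpus with few strongly procedural documents is not shown here.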

Item Type: Book Chapter
ISBN: 3-540-33347-9, 978-3-540-33347-0
ISSN: 1611-3349
Extra Information: Proceedings of the 28th European Conference on IR Research, ECIR 2006, London, UK, April 10-12, 2006.
Keywords: machine learning; information retrieval
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Interdisciplinary Research Centre: Centre for Research in Computing (CRC)
Item ID: 8471
Depositing User: Richard Power
Date Deposited: 23 Jul 2007
Last Modified: 07 Apr 2011 20:25
URI: http://oro.open.ac.uk/id/eprint/8471