An effective approach to verbose queries using a limited dependencies language model

Hoenkamp, Eduard; Bruza, Peter; Song, Dawei and Huang, Qiang (2009). An effective approach to verbose queries using a limited dependencies language model. In: 2nd International Conference on Theory of Information Retrieval (ICTIR2009), 10-12 September 2009, Cambridge.

URL: http://www.springerlink.com/content/qx457g2715n0hg...
DOI (Digital Object Identifier) Link: http://dx.doi.org/10.1007/978-3-642-04417-5_11

Abstract

Intuitively, any ‘bag of words’ approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. First, the term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document’s initial distribution. A secondary contribution is to investigate the practical application of this representation when queries become increasingly verbose. In the experiments (based on Lemur’s search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than those of more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
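
The first two steps described in the abstract (a Markov chain over term co-occurrences, then its stationary distribution) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sliding-window co-occurrence counts, the additive smoothing used to guarantee ergodicity, and the names `transition_matrix` and `stationary_distribution` are all assumptions made for illustration, since the abstract does not specify how the chain is built.

```python
def transition_matrix(tokens, window=2, smoothing=1e-3):
    """Build a row-stochastic matrix from windowed term co-occurrence counts.

    Assumption: co-occurrence is counted symmetrically within `window`
    positions. The small additive `smoothing` makes every entry positive,
    so this illustrative chain is irreducible and aperiodic.
    """
    vocab = sorted(set(tokens))
    index = {term: i for i, term in enumerate(vocab)}
    n = len(vocab)
    counts = [[smoothing] * n for _ in range(n)]
    for i, term in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[index[term]][index[tokens[j]]] += 1.0
    # Normalize each row so it sums to 1 (a probability distribution).
    matrix = [[c / sum(row) for c in row] for row in counts]
    return vocab, matrix


def stationary_distribution(matrix, start=None, tol=1e-12, max_iter=10000):
    """Approximate pi with pi = pi * P by power iteration from `start`."""
    n = len(matrix)
    pi = list(start) if start is not None else [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    query = "markov chain models term dependencies in verbose queries with a markov chain"
    vocab, P = transition_matrix(query.split())
    pi = stationary_distribution(P)
    # Print terms by descending stationary probability.
    for term, prob in sorted(zip(vocab, pi), key=lambda pair: -pair[1]):
        print(f"{term:14s} {prob:.4f}")
```

Starting the iteration from a different initial distribution (for example, all mass on one term) converges to the same vector, which is the independence from the initial state that the abstract refers to. In the approach described above, a distribution of this kind would stand in for the query model before ranking under the language modeling paradigm.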

Item Type: Conference Item
Copyright Holders: 2009 Springer-Verlag
ISSN: 0302-9743
Extra Information: Advances in Information Retrieval Theory
Second International Conference on the Theory of Information Retrieval, ICTIR 2009
Cambridge, UK, September 10-12, 2009
Proceedings
Edited by Leif Azzopardi, Gabriella Kazai, Stephen Robertson, Stefan Rüger, Milad Shokouhi, Dawei Song, Emine Yilmaz
Lecture Notes in Computer Science, 5766
ISBN-13 978-3-642-04416-8
pp.116-127
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Knowledge Media Institute
Item ID: 35328
Depositing User: Dawei Song
Date Deposited: 14 Nov 2012 14:09
Last Modified: 14 Nov 2012 21:28
URI: http://oro.open.ac.uk/id/eprint/35328