The Document as ergodic Markov chain

Song, Dawei and Hoenkamp, Eduard (2004). The Document as ergodic Markov chain. In: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR'2004), 25-29 Jul 2004, Sheffield, United Kingdom.

DOI (Digital Object Identifier) Link: http://dx.doi.org/10.1145/1008992.1009088

Abstract

In recent years, statistical language models have been proposed as an alternative to the vector space model. Viewing documents as language samples introduces the issue of defining a joint probability distribution over the terms. The present paper models a document as the result of a Markov process. It argues that this process is ergodic, which is theoretically plausible and easy to verify in practice. The theoretical result is that the joint distribution can be easily obtained. This can also be applied to search resolutions other than the document level. We verified this in an experiment on query expansion, demonstrating both the validity and the practicability of the method. This holds promise for general language models.
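
As a rough illustration of the idea in the abstract, the sketch below treats a tokenized document as a Markov chain over its terms and reads a term distribution off the chain's stationary distribution. It is a minimal sketch under stated assumptions, not the authors' implementation: estimating transitions from windowed co-occurrence counts, the additive smoothing (used here to make every transition probability positive, which guarantees ergodicity), and the names `transition_matrix` and `stationary_distribution` are all hypothetical choices made for this example.

```python
def transition_matrix(tokens, window=2, smoothing=1e-3):
    """Build a row-stochastic term-to-term matrix from a token list.

    Assumption: a transition from term a to term b is counted whenever b
    occurs within `window` positions after a. The small `smoothing` value
    keeps every entry positive, so the chain is irreducible and aperiodic.
    """
    vocab = sorted(set(tokens))
    index = {term: i for i, term in enumerate(vocab)}
    n = len(vocab)
    counts = [[smoothing] * n for _ in range(n)]
    for i, term in enumerate(tokens):
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            counts[index[term]][index[tokens[j]]] += 1.0
    # Normalize each row so that it sums to 1.
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total for c in row])
    return vocab, matrix


def stationary_distribution(matrix, tol=1e-12, max_iter=10000):
    """Approximate pi with pi = pi * P by power iteration from a uniform start."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    tokens = "the cat sat on the mat the cat ran".split()
    vocab, P = transition_matrix(tokens)
    pi = stationary_distribution(P)
    for term, prob in sorted(zip(vocab, pi), key=lambda pair: -pair[1]):
        print(f"{term}\t{prob:.4f}")
```

Because the chain is ergodic, the power iteration converges to the same distribution regardless of the starting vector. The same two functions could be applied to a passage or a whole collection instead of a single document, which is one way to read the abstract's remark about other search resolutions.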

Item Type: Conference Item
Keywords: ergodic process; language models; semantic space
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Item ID: 9041
Depositing User: Aneta Tumilowicz
Date Deposited: 25 Sep 2007
Last Modified: 22 Jun 2012 11:37
URI: http://oro.open.ac.uk/id/eprint/9041