The Open University

A Bayesian mixture model for term re-occurrence and burstiness

Sarkar, Avik; Garthwaite, Paul and De Roeck, Anne (2005). A Bayesian mixture model for term re-occurrence and burstiness. In: Ninth Conference on Computational Language Learning (CoNLL), 29-30 June 2005, Ann Arbor, Michigan, USA.



This paper proposes a model for term re-occurrence in a text collection based on the gaps between successive occurrences of a term. These gaps are modeled using a mixture of exponential distributions. Parameter estimation is based on a Bayesian framework that allows us to fit a flexible model. The model provides measures of a term’s re-occurrence rate and within-document burstiness. The model works for all kinds of terms, be they rare content words, medium-frequency terms or frequent function words. A measure is proposed to account for a term’s importance based on its distribution pattern in the corpus.

Item Type: Conference Item
Keywords: term distribution modelling; term burstiness; natural language processing; Bayesian modelling
Academic Unit/Department: Mathematics, Computing and Technology > Computing & Communications
Mathematics, Computing and Technology
Mathematics, Computing and Technology > Mathematics and Statistics
Other Departments > Vice-Chancellor's Office
Other Departments
Interdisciplinary Research Centre: Centre for Research in Computing (CRC)
Item ID: 5003
Depositing User: Anne De Roeck
Date Deposited: 18 Jul 2006
Last Modified: 24 Feb 2016 04:59