Document Boltzmann Machines for Information Retrieval

Yu, Qian; Zhang, Peng; Hou, Yuexian; Song, Dawei; Wang, Jun (2015). Document Boltzmann Machines for Information Retrieval. In: Advances in Information Retrieval - 37th European Conference on IR Research, Lecture Notes in Computer Science, Springer International Publishing, pp. 666–671.



Probabilistic language modelling has been widely used in information retrieval: it estimates document models under a multinomial distribution assumption and ranks documents by query likelihood. In this paper, we aim to generalize this distribution assumption by exploring the use of fully-observable Boltzmann Machines (BMs) for document modelling. A BM is a stochastic recurrent network that can model the distribution of multi-dimensional variables; it yields a Boltzmann distribution, which is more general than the multinomial distribution. We propose a Document Boltzmann Machine (DBM) that naturally captures the intrinsic connections among terms and estimates query likelihood efficiently. We formally prove that, under certain conditions (when only first-order parameters are learnt), the DBM subsumes the traditional document language model. Its relations to other graphical models in IR, e.g., the MRF model, are also discussed. Our experiments on document re-ranking demonstrate the potential of the proposed DBM.
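To make the idea concrete, the following is a minimal sketch (not the paper's implementation) of a fully-visible Boltzmann machine over binary term-occurrence vectors: first-order biases `b` play the role of single-term statistics, second-order weights `W` capture pairwise term connections, and a query is scored by its log-likelihood under the model. All parameter values, the tiny vocabulary, and the brute-force partition function are illustrative assumptions for a toy example only.

```python
import itertools
import numpy as np

def energy(x, b, W):
    """Energy of a fully-visible Boltzmann machine:
    E(x) = -b.x - 0.5 * x^T W x, with W symmetric and zero-diagonal."""
    return -(b @ x) - 0.5 * (x @ W @ x)

def log_partition(b, W):
    # Brute-force sum over all 2^n binary states (log-sum-exp for stability).
    # Only feasible for tiny vocabularies; real models need approximation.
    n = len(b)
    states = np.array(list(itertools.product([0, 1], repeat=n)))
    energies = np.array([energy(x, b, W) for x in states])
    m = (-energies).max()
    return m + np.log(np.exp(-energies - m).sum())

def log_prob(x, b, W):
    """Log-probability of a binary term vector under the Boltzmann distribution."""
    return -energy(x, b, W) - log_partition(b, W)

# Toy "document model" over a 4-term vocabulary (illustrative values):
# b encodes first-order term statistics; W encodes a pairwise connection.
b = np.array([0.8, 0.5, -0.2, 0.1])
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = 0.6   # terms 0 and 1 tend to co-occur

# A query containing terms 0 and 1, as a binary vector; documents would be
# ranked by this query log-likelihood under each document's model.
query = np.array([1, 1, 0, 0])
score = log_prob(query, b, W)
```

Setting `W` to zero leaves only the first-order biases, which corresponds to the special case the paper relates to the traditional document language model.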
