Bias-variance decomposition of IR evaluation

Zhang, Peng; Song, Dawei; Wang, Jun and Hou, Yuexian (2013). Bias-variance decomposition of IR evaluation. In: 36th international ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2013), 28 Jul - 1 Aug 2013, Dublin, Ireland, pp. 1021–1024.




It has been recognized that when an information retrieval (IR) system improves mean retrieval effectiveness (e.g., mean average precision (MAP)) over all queries, the performance (e.g., average precision (AP)) of some individual queries can be hurt, resulting in retrieval instability. Several stability/robustness metrics have been proposed, but they are often defined separately from the mean effectiveness metric. Consequently, there is no unified formulation of effectiveness, stability, and overall retrieval quality (which considers both). In this paper, we present a unified formulation based on the bias-variance decomposition. Correspondingly, a novel evaluation methodology is developed to evaluate effectiveness and stability in an integrated manner. A case study applying the proposed methodology to the evaluation of query language modeling illustrates the usefulness and analytical power of our approach.
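The abstract does not spell out the formulation, so the following is only a minimal sketch of how a bias-variance decomposition can be applied to per-query scores in general. It assumes a constant target AP of 1.0 for every query and treats the per-query shortfall from that target as the error; the function name `bias_variance`, the `target` parameter, and the sample AP values are hypothetical and are not taken from the paper. Under these assumptions the mean squared error over queries equals the squared bias (related to mean effectiveness) plus the variance (related to stability).

```python
from statistics import fmean


def bias_variance(ap_scores, target=1.0):
    """Decompose the mean squared error of per-query AP against a target.

    Assumption (not from the paper): the same constant target is used
    for every query, and the error is target minus the query's AP.
    """
    errors = [target - ap for ap in ap_scores]
    bias = fmean(errors)  # average shortfall from the target
    variance = fmean([(e - bias) ** 2 for e in errors])  # spread across queries
    mse = fmean([e ** 2 for e in errors])  # overall error
    return bias, variance, mse


# Hypothetical per-query AP values for a single system.
ap_scores = [0.5, 0.25, 0.75, 0.5]
bias, variance, mse = bias_variance(ap_scores)

# The identity mse == bias**2 + variance holds for any list of scores.
print(bias, variance, mse)
```

With a constant target, a lower bias corresponds to a higher MAP, while a lower variance indicates that per-query performance is more uniform, so the two terms can be inspected separately or summed.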
