Generalized bias-variance evaluation of TREC participated systems

Zhang, Peng; Hao, Linxue; Song, Dawei; Wang, Jun; Hou, Yuexian and Hu, Bin (2014). Generalized bias-variance evaluation of TREC participated systems. In: Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, ACM, pp. 1911–1914.

Recent research has shown that improving mean retrieval effectiveness (e.g., MAP) may sacrifice retrieval stability across queries, implying a tradeoff between effectiveness and stability. The evaluation of both effectiveness and stability is often based on a baseline model, which could be weak or biased. In addition, the effectiveness-stability tradeoff has not been systematically or quantitatively evaluated over TREC participating systems. These two problems, to some extent, limit our awareness of this tradeoff and its impact on the development of future IR models. In this paper, motivated by a recently proposed bias-variance based evaluation, we adopt a strong and unbiased “baseline”: a virtual target model constructed from the best performance (for each query) among all the participating systems in a retrieval task. We also propose generalized bias-variance metrics, based on which a systematic and quantitative evaluation of the effectiveness-stability tradeoff is carried out over the participating systems in the TREC Ad-hoc Track (1993-1999) and Web Track (2010-2012). We observe a clear effectiveness-stability tradeoff, which becomes more pronounced in more recent years. This implies that, as more effective IR systems have been pursued over the years, stability has become problematic and may have been largely overlooked.
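
The abstract does not give the definitions of the generalized metrics, so the following is only a rough sketch of the idea of a virtual target model, not the paper's formulation. It assumes that each system has one effectiveness score (e.g., AP) per query, that the target takes the per-query maximum over all systems, and that "bias" and "variance" are read as the mean and the population variance of a system's per-query shortfall from that target. The function names `virtual_target` and `bias_variance`, the system names, and all score values are hypothetical.

```python
from statistics import mean, pvariance


def virtual_target(scores):
    """Per-query best score over all systems.

    scores: dict mapping system name -> list of per-query scores,
    with every list in the same query order.
    """
    return [max(per_query) for per_query in zip(*scores.values())]


def bias_variance(system_scores, target):
    """Assumed reading of the metrics, not the paper's definitions.

    bias: mean shortfall from the target (lower = more effective).
    variance: spread of that shortfall across queries (lower = more stable).
    """
    # Each gap is >= 0 because the target is the per-query maximum.
    gaps = [t - s for s, t in zip(system_scores, target)]
    return mean(gaps), pvariance(gaps)


# Hypothetical AP values for three systems over four queries.
scores = {
    "sysA": [0.50, 0.50, 0.50, 0.50],
    "sysB": [0.90, 0.10, 0.80, 0.40],
    "sysC": [0.60, 0.40, 0.70, 0.60],
}

target = virtual_target(scores)
results = {name: bias_variance(s, target) for name, s in scores.items()}

for name, (bias, variance) in results.items():
    print(f"{name}: bias={bias:.4f} variance={variance:.4f}")
```

In this toy data, `sysB` has a smaller mean shortfall than `sysA` but a larger spread, which is the kind of effectiveness-stability tradeoff the abstract describes. Measuring against the per-query best rather than a single baseline system means that no one system's weaknesses shape the comparison.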

Item Type: Conference or Workshop Item
Copyright Holders: 2014 ACM
ISBN: 1-4503-2598-X, 978-1-4503-2598-1
Project Funding Details:
Funded Project Name | Project ID | Funding Body
Not Set | 61402324 | Natural Science Foundation of China
Not Set | 61272265 | Natural Science Foundation of China
Not Set | 61105072 | Natural Science Foundation of China
973 Program | 2013CB329304 | Chinese National Program on Key Basic Research Project
973 Program | 2014CB744604 | Chinese National Program on Key Basic Research Project
Keywords: evaluation; effectiveness-stability tradeoff; bias-variance tradeoff; virtual target model
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Computing and Communications
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Item ID: 40779
Depositing User: Dawei Song
Date Deposited: 03 Sep 2014 15:50
Last Modified: 07 Dec 2018 23:02