The Open University

Towards a universal bibliography – the RefBank approach

Sautter, Guido; King, David and Morse, David (2012). Towards a universal bibliography – the RefBank approach. In: TDWG (Biodiversity Information Standards) 2012, 22-26 Oct 2012, Beijing, PRC.


No universal bibliography for systematic biology has yet been compiled successfully. Commercial services like Mendeley or Zotero are gaining some traction. However, they are monolithic systems, each under the control of a single entity, so the biodiversity data they hold is scattered among them with no easy means of aggregation or interchange. As competitors, they have little incentive to offer such functionality.

The ViBRANT project aims to compile a Bibliography of Life, building on the project’s available infrastructure and gathering data from across the biodiversity sciences. Evaluating existing platforms, both commercial and scientific, we found that none met the needs of a sustainable, universal bibliography, for the following reasons: (1) They are too narrowly focused to achieve critical mass in data volume or in user community. (2) They are monolithic systems with single points of failure and low perceived sustainability. (3) They are too focused on data analysis and research, at the expense of building a reliable base system, and hence appear to be prototypes rather than stable platforms, which reduces their perceived reliability and sustainability. (4) They integrate data curation with data input, which makes contributing references tedious and thereby alienates potential users.

Thus, we built RefBank, following a radically different approach that applies proven principles from other forms of data management. (1) RefBank is an open, coordinator-free network of independent nodes that replicate the data among themselves, eliminating any single point of failure and achieving reliability and sustainability through redundancy. (2) No single entity governs the data; anyone can set up a node and link it into the network, and the web application can be downloaded from most existing nodes. Replication is pull-based, so no node can actively push erroneous data into the network. (3) Contributing is easy: anyone can upload their bibliographies, be they in BibTeX, EndNote, plain text or many other common formats, without any curation and without prior registration; ReCAPTCHA protects the upload forms. (4) RefBank uses graph theory to embrace near-duplicate references, exploiting their inherent redundancy to enable automated reconciliation and curation through data mining techniques. (5) RefBank’s web interface supports manual curation, though manual curation is not required for the system to work; users can correct errors later, as they find them when using the reference collection. (6) RefBank provides multiple data export formats, e.g. BibTeX and RIS, and can output references in a variety of common styles, e.g. Chicago or Harvard.
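
Point (4) can be illustrated with a minimal sketch. The abstract does not say which similarity measure or graph algorithm RefBank uses, so everything below is an assumption made for illustration: references are treated as nodes, an edge joins two references whose normalized strings are similar, and the connected components of that graph are taken as groups of near duplicates. The names `normalize` and `cluster_near_duplicates`, the use of Python's `difflib.SequenceMatcher`, and the threshold of 0.85 are all hypothetical choices, not part of RefBank.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(reference: str) -> str:
    """Lowercase the string and reduce punctuation to single spaces."""
    kept = "".join(ch if ch.isalnum() else " " for ch in reference.lower())
    return " ".join(kept.split())


def cluster_near_duplicates(references, threshold=0.85):
    """Group references whose normalized strings are similar.

    Hypothetical illustration, not RefBank's algorithm: an undirected graph
    gets an edge wherever the similarity ratio reaches `threshold`, and its
    connected components are returned as sorted lists of indices.
    """
    normalized = [normalize(r) for r in references]
    parent = list(range(len(references)))  # union-find forest

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is quadratic; acceptable only for a small sketch.
    for i, j in combinations(range(len(references)), 2):
        ratio = SequenceMatcher(None, normalized[i], normalized[j]).ratio()
        if ratio >= threshold:
            parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for i in range(len(references)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    refs = [
        "Sautter, G., King, D. and Morse, D. (2012). Towards a universal "
        "bibliography - the RefBank approach. TDWG 2012.",
        "Sautter G, King D, Morse D (2012) Towards a universal "
        "bibliography: the RefBank approach. TDWG 2012",
        "Darwin, C. (1859). On the Origin of Species. London: John Murray.",
    ]
    for group in cluster_near_duplicates(refs):
        print([refs[i][:40] for i in group])
```

In this sketch the two differently punctuated renderings of the same citation end up in one group, while the unrelated reference stays on its own. A group of this kind is the redundancy that automated reconciliation could then draw on, for example by choosing the most frequent variant of each field.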

Item Type: Conference or Workshop Item
Copyright Holders: 2012 The authors
Project Funding Details:
Funded Project Name: Not Set
Project ID: Not Set
Funding Body: European Union 7th Framework Programme
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Computing and Communications
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Item ID: 35818
Depositing User: David King
Date Deposited: 14 Jan 2013 10:45
Last Modified: 07 Dec 2018 16:01