Sautter, Guido; Böhm, Klemens and King, David (2013).
DOI: https://doi.org/10.1007/978-3-319-03599-4_20
URL: http://isim.ac.in/ICADL2013/Accepted_Papers_and_Po...
Abstract
Comprehensive bibliographies often rely on community contributions. In such a setting, de-duplication is mandatory for the bibliography to be useful. Ideally, it works online, i.e., while new references are being added, so that the bibliography remains duplicate-free at all times. While de-duplication is well researched, generic approaches do not achieve the result quality required for automated reconciliation. To overcome this problem, we propose a new duplicate detection and reconciliation technique called RefConcile. Aimed specifically at bibliographic references, it uses dedicated blocking and matching techniques tailored to this type of data. Our evaluation, based on a large real-world collection of bibliographic references, shows that RefConcile scales well and that it detects and reconciles duplicates with high accuracy.
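
To make the terms "blocking", "matching", and "online" concrete, the following is a minimal generic sketch in Python. It is not RefConcile's algorithm, which the abstract does not specify. The blocking key (year plus first author's surname), the title-similarity matcher, the 0.9 threshold, and all names (`Bibliography`, `blocking_key`, `is_duplicate`) are assumptions chosen purely for illustration.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse every run of non [a-z0-9] characters into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def blocking_key(ref):
    """Hypothetical blocking key: publication year plus first author's surname."""
    surname = normalize(ref["authors"][0].split(",")[0])
    return (ref["year"], surname)


def is_duplicate(a, b, threshold=0.9):
    """Hypothetical matcher: normalized titles must be nearly identical."""
    ratio = SequenceMatcher(None, normalize(a["title"]), normalize(b["title"])).ratio()
    return ratio >= threshold


class Bibliography:
    """Online de-duplication: each reference is checked at insertion time."""

    def __init__(self):
        # Maps a blocking key to the references stored under that key.
        self.blocks = defaultdict(list)

    def add(self, ref):
        """Return the stored duplicate if one exists; otherwise store ref and return None."""
        block = self.blocks[blocking_key(ref)]
        # Blocking: compare only against references sharing the same key.
        for existing in block:
            if is_duplicate(existing, ref):
                return existing
        block.append(ref)
        return None
```

A short usage example with made-up references:

```python
bib = Bibliography()
first = {"authors": ["Smith, John"], "year": 2010, "title": "A Study of Ants"}
variant = {"authors": ["Smith, J."], "year": 2010, "title": "A study of ants."}

bib.add(first)             # stored, no duplicate found
match = bib.add(variant)   # falls into the same block and matches `first`
```

Blocking keeps each insertion cheap because a new reference is compared only with the few entries in its own block rather than with the whole collection. The trade-off in this simple sketch is that a typo in the year or surname puts a true duplicate into a different block, where it is never compared.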