Challenging knowledge extraction to support the curation of documentary evidence in the humanities

Daga, Enrico and Motta, Enrico (2019). Challenging knowledge extraction to support the curation of documentary evidence in the humanities. In: Third International Workshop on Capturing Scientific Knowledge (Sciknow). Collocated with the tenth International Conference on Knowledge Capture (K-CAP) (Garijo, Daniel; Markovic, Milan; Groth, Paul; Santana, Idafen and Belhajjame, Khalid eds.), Los Angeles, CA, USA.


Abstract

The identification and cataloguing of documentary evidence from textual corpora is an important part of empirical research in the humanities. In this position paper, we consider the applicability of knowledge extraction techniques to supporting the data acquisition process. We first characterise the task by analysing the end-to-end process of the data curation activity. We then examine general knowledge extraction tasks and discuss their relation to the problem at hand. Taking the Listening Experience Database (LED) as a case study, we perform an empirical analysis focusing on two roles: the 'listener' and the 'place'. The results show, among other things, that the entities are often mentioned many paragraphs away from the evidence text or do not appear in the source at all. We discuss the challenges that emerged from the point of view of scientific knowledge acquisition.

Item Type: Conference or Workshop Item
Project Funding Details:
Funded Project Name: The Listening Experience Database (A-11-031-DR)
Project ID: AH/J013986/1
Funding Body: AHRC (Arts & Humanities Research Council)
Keywords: documentary evidence, knowledge extraction, named entity recognition, DBpedia
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Item ID: 67961
Depositing User: Enrico Daga
Date Deposited: 04 Nov 2019 09:36
Last Modified: 13 Nov 2019 17:46
URI: http://oro.open.ac.uk/id/eprint/67961