Copy the page URI to the clipboard
Daga, Enrico and Motta, Enrico
(2019).
Abstract
The identification and cataloguing of documentary evidence from textual corpora is an important part of empirical research in the humanities. In this position paper, we ponder the applicability of knowledge extraction techniques to support the data acquisition process. Initially, we characterise the task by analysing the end-to-end process occurring in the data curation activity. After that, we examine general knowledge extraction tasks and discuss their relation to the problem at hand. Considering the case of the Listening Experience Database (LED), we perform an empirical analysis focusing on two roles: the 'listener' and the 'place'. The results show, among other things, how the entities are often mentioned many paragraphs away from the evidence text or are not in the source at all. We discuss the challenges emerged from the point of view of scientific knowledge acquisition.
Viewing alternatives
Download history
Item Actions
Export
About
- Item ORO ID
- 67961
- Item Type
- Conference or Workshop Item
- Project Funding Details
-
Funded Project Name Project ID Funding Body The Listening Experience Database (A-11-031-DR) AH/J013986/1 AHRC (Arts & Humanities Research Council) - Keywords
- documentary evidence, knowledge extraction, named entity recognition, DBpedia
- Academic Unit or School
-
Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM) - Research Group
- Centre for Research in Computing (CRC)
- Depositing User
- Enrico Daga