Copy the page URI to the clipboard
Daga, Enrico and Motta, Enrico
(2019).
DOI: https://doi.org/10.1145/3360901.3364415
Abstract
The task of identifying pieces of evidence in texts is of fundamental importance in supporting qualitative studies in various domains, especially in the humanities. In this paper, we coin the expression themed evidence, to refer to (direct or indirect) traces of a fact or situation relevant to a theme of interest and study the problem of identifying them in texts. We devise a generic framework aimed at capturing themed evidence in texts based on a hybrid approach, combining statistical natural language processing, background knowledge, and Semantic Web technologies. The effectiveness of the method is demonstrated in a case study of a digital humanities database aimed at collecting and curating a repository of evidence of experiences of listening to music. Extensive experiments demonstrate that our hybrid approach outperforms alternative solutions. We also evidence its generality by testing it on a different use case in the digital humanities.