Leedham, Maria; Lillis, Theresa and Twiner, Alison
(2021).
DOI: https://doi.org/10.1016/j.acorp.2021.100011
Abstract
Corpus linguistics is increasingly employed to explore large, publicly-available datasets such as newspaper texts, government speeches and online fora. However, comparatively few corpora exist where the subject matter concerns sensitive topics about living individuals since, due to their highly personal and confidential nature, these texts are hard to access and raise difficult ethical questions around secondary data analysis. One exception is the XXXcorpus, comprising texts written by UK-based professional social workers in the course of their daily work and now available to other researchers through the ReShare archive. This paper focuses on the challenges involved in building the XXXcorpus and the epistemological and ethical issues raised. Two key aspects of research practice are discussed: data anonymisation and dataset archiving. Specifically, the paper explores decision-making around anonymisation and an ethically-informed rationale for treating some texts as ‘not for sharing’, leading to the decision to create two corpora: one for the research team and a further anonymised and slightly reduced version for archiving. The paper explores what the XXXcorpora (Corpus 1 and Corpus 2) contribute to understandings about social work writing, the extent to which the two corpora enable different analyses and whether the existence of two corpora is problematic from a corpus linguistics perspective. The paper concludes by considering how the ethical decisions around corpus creation of sensitive texts raise questions about key principles in corpus linguistics.
About
- Item ORO ID
- 79927
- Item Type
- Journal Item
- ISSN
- 2666-7991
- Project Funding Details
- Funded Project Name: Writing in professional social work practice in a changing communicative landscape; Project ID: ES/M008703/1; Funding Body: ESRC (Economic and Social Research Council)
- Keywords
- Archiving; anonymisation; data sharing; social work; writing
- Academic Unit or School
- Faculty of Wellbeing, Education and Language Studies (WELS) > Languages and Applied Linguistics > English Language & Applied Linguistics
- Faculty of Wellbeing, Education and Language Studies (WELS) > Languages and Applied Linguistics
- Faculty of Wellbeing, Education and Language Studies (WELS)
- Copyright Holders
- © 2021 Elsevier Ltd.
- Depositing User
- ORO Import