Farrell, Tracie; Araque, Oscar; Fernandez, Miriam and Alani, Harith
(2020).
DOI: https://doi.org/10.1145/3394231.3397912
Abstract
Understanding the identities, needs, realities and development of subcultures has been a long-term target of sociology and cultural studies. Socio-cultural linguistics, in particular, examines the use of language, especially the existence and use of neologisms, slang and jargon. These terms capture concepts and expressions that are not in common use and represent the new realities, norms and values of subcommunities. Identifying and understanding such terms, however, is a very complex task, particularly considering the vast amount of content that is currently available online for many such groups. In this paper, we propose a combination of computational and socio-linguistic methods to automatically extract new terminology from large amounts of data, using word-embeddings to semantically contextualise their meaning. As a use case, we explore subculture on the platform Reddit. More specifically, we investigate groups considered part of the manosphere, a loose online community where men’s perspectives, gripes, frustrations and desires are explicitly expressed and where women are typically targets of hostility. Characterisations of this group as a subculture are then provided, based on an in-depth analysis of the identified jargon.