Automatic stopword generation using contextual semantics for sentiment analysis of Twitter

Saif, Hassan; Fernández, Miriam and Alani, Harith (2014). Automatic stopword generation using contextual semantics for sentiment analysis of Twitter. In: Proceedings of the ISWC 2014 Posters & Demonstrations Track.

URL: http://ceur-ws.org/Vol-1272/paper_55.pdf

Abstract

In this paper we propose a semantic approach to automatically identify and remove stopwords from Twitter data. Unlike most existing approaches, which rely on outdated and context-insensitive stopword lists, our approach considers the contextual semantics and sentiment of words in order to measure their discrimination power. Evaluation results on 6 Twitter datasets show that removing our semantically identified stopwords from tweets increases binary sentiment classification performance over the classic pre-compiled stopword list by 0.42% in accuracy and 0.94% in F-measure. Our approach also reduces the sentiment classifier's feature space by 48.34% and the dataset sparsity by 1.17%, on average, compared to the classic method.
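The two size-related quantities reported in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes whitespace tokenization, a binary tweet-by-term matrix, and sparsity defined as the fraction of zero cells in that matrix. The function names, the toy tweets, and the stopword list are all hypothetical.

```python
# Illustrative only: measures how a given stopword list changes the
# vocabulary size (feature space) and the sparsity of a toy tweet corpus.
# Tokenization and the sparsity definition are assumptions, not taken
# from the paper.

def tokens(tweet, stopwords=frozenset()):
    """Distinct lowercase tokens of one tweet, with stopwords removed."""
    return {tok for tok in tweet.lower().split() if tok not in stopwords}

def vocabulary(tweets, stopwords=frozenset()):
    """Distinct tokens across all tweets, with stopwords removed."""
    vocab = set()
    for tweet in tweets:
        vocab |= tokens(tweet, stopwords)
    return vocab

def feature_space_reduction(tweets, stopwords):
    """Relative drop in vocabulary size after removing stopwords."""
    before = len(vocabulary(tweets))
    after = len(vocabulary(tweets, stopwords))
    return (before - after) / before if before else 0.0

def sparsity(tweets, stopwords=frozenset()):
    """Fraction of zero cells in the binary tweet-by-term matrix."""
    vocab = vocabulary(tweets, stopwords)
    if not tweets or not vocab:
        return 0.0
    nonzero = sum(len(tokens(tweet, stopwords)) for tweet in tweets)
    return 1.0 - nonzero / (len(tweets) * len(vocab))

# Hypothetical toy data.
toy_tweets = ["I love the new phone", "the battery is bad", "I hate the screen"]
toy_stopwords = frozenset({"i", "the", "is"})

reduction = feature_space_reduction(toy_tweets, toy_stopwords)
sparsity_before = sparsity(toy_tweets)
sparsity_after = sparsity(toy_tweets, toy_stopwords)
```

Running the same two measurements with a classic pre-compiled list and with an automatically generated list would give the kind of comparison the abstract summarizes, although the figures reported there come from the authors' own datasets and setup.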
