The Open University

Alleviating data sparsity for Twitter sentiment analysis

Saif, Hassan; He, Yulan and Alani, Harith (2012). Alleviating data sparsity for Twitter sentiment analysis. In: 2nd Workshop on Making Sense of Microposts (#MSM2012): Big things come in small packages at the 21st International Conference on the World Wide Web (WWW'12), 16 Apr 2012, Lyon, France, CEUR Workshop Proceedings (CEUR-WS.org), pp. 2–9.

Full text available as: PDF (Version of Record), 637kB
URL: http://www2012.wwwconference.org/

Abstract

Twitter has recently attracted much attention as a research topic in sentiment analysis. Training sentiment classifiers on tweet data often runs into the data sparsity problem, partly because the 140-character limit leads to a large variety of short and irregular word forms. In this work we propose two different sets of features to alleviate this problem. The first is the semantic feature set, where we extract semantically hidden concepts from tweets and then incorporate them into classifier training through interpolation. The second is the sentiment-topic feature set, where we extract latent topics and the associated topic sentiment from tweets, then augment the original feature space with these sentiment-topics. Experimental results on the Stanford Twitter Sentiment Dataset show that both feature sets outperform the baseline model using unigrams only. Moreover, using semantic features rivals the previously reported best result. Using sentiment-topic features achieves 86.3% sentiment classification accuracy, which outperforms existing approaches.
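
As an illustration of what incorporating semantic concepts "through interpolation" could look like, here is a minimal sketch, not the authors' implementation. It assumes a Naive Bayes style word likelihood and mixes a Laplace-smoothed unigram estimate with a concept-based estimate. The function names, the mixing weight `alpha`, the concept label "PRODUCT", and all toy numbers are hypothetical and chosen only for the example.

```python
from collections import Counter


def unigram_prob(word, class_counts, vocab_size):
    """Laplace-smoothed unigram likelihood P_u(word | class)."""
    total = sum(class_counts.values())
    return (class_counts[word] + 1) / (total + vocab_size)


def interpolated_prob(word, class_counts, vocab_size,
                      p_word_given_concept, p_concept_given_class,
                      alpha=0.3):
    """Mix the unigram likelihood with a concept-based likelihood.

    Assumed form (an illustration, not taken from the abstract):
    P(w | c) = (1 - alpha) * P_u(w | c)
               + alpha * sum_s P(w | s) * P(s | c)
    """
    p_unigram = unigram_prob(word, class_counts, vocab_size)
    p_semantic = sum(
        p_word_given_concept.get(concept, {}).get(word, 0.0) * p_concept
        for concept, p_concept in p_concept_given_class.items()
    )
    return (1 - alpha) * p_unigram + alpha * p_semantic


# Toy data: "iphone" never occurs in the positive training tweets,
# but it shares the hypothetical concept "PRODUCT" with "ipad".
pos_counts = Counter({"love": 3, "ipad": 1})
vocab_size = 4
p_word_given_concept = {"PRODUCT": {"ipad": 0.5, "iphone": 0.5}}
p_concept_given_class = {"PRODUCT": 1.0}

plain = unigram_prob("iphone", pos_counts, vocab_size)
smoothed = interpolated_prob("iphone", pos_counts, vocab_size,
                             p_word_given_concept, p_concept_given_class)
print(plain, smoothed)
```

In this toy setting the unseen word "iphone" receives a higher likelihood under the interpolated estimate than under the unigram estimate alone, which is the kind of effect one would expect from a feature set aimed at data sparsity.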

Item Type: Conference or Workshop Item
Copyright Holders: 2012 The Authors
Project Funding Details:
Funded Project Name: ROBUST
Project ID: Grant number 257859
Funding Body: EU
Keywords: microblogs; sentiment analysis; opinion mining; Twitter; semantic smoothing; data sparsity
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Item ID: 38501
Depositing User: Harith Alani
Date Deposited: 25 Sep 2013 10:08
Last Modified: 07 Dec 2018 11:06
URI: http://oro.open.ac.uk/id/eprint/38501

© The Open University