
Self-training from labeled features for sentiment analysis

He, Yulan and Zhou, Deyu (2011). Self-training from labeled features for sentiment analysis. Information Processing & Management, 47(4) pp. 606–616.

DOI (Digital Object Identifier) Link: http://dx.doi.org/10.1016/j.ipm.2010.11.003

Abstract

Sentiment analysis concerns automatically identifying the sentiment or opinion expressed in a given piece of text. Most prior work either uses prior lexical knowledge, defined as the sentiment polarity of words, or views the task as a text classification problem and relies on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort.
In this paper, we propose a novel framework in which an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon, with preferences on the expected sentiment labels of those lexicon words expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model’s predictions on unlabeled instances. Experiments on both the movie-review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
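
The self-training loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a simple lexicon vote stands in for the classifier trained with generalized expectation criteria, and a sum of per-word log-odds stands in for the second, constrained classifier. The lexicon, the documents, the 0.9 confidence threshold, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy lexicon and unlabeled documents (illustration only).
LEXICON = {"good": "pos", "great": "pos", "bad": "neg", "awful": "neg"}
DOCS = [
    "great plot and good acting gripping",
    "good soundtrack gripping story great",
    "bad script awful pacing tedious",
    "awful acting tedious and bad",
    "gripping from start to finish",
    "tedious from start to finish",
]

def lexicon_score(tokens):
    """Stand-in initial classifier: share of lexicon hits that are positive."""
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    if not hits:
        return None  # no lexicon evidence, so no prediction
    return hits.count("pos") / len(hits)

def pseudo_label(docs, threshold=0.9):
    """Keep only documents that the initial classifier labels confidently."""
    labeled = []
    for doc in docs:
        tokens = doc.split()
        p_pos = lexicon_score(tokens)
        if p_pos is None:
            continue
        if p_pos >= threshold:
            labeled.append((tokens, "pos"))
        elif p_pos <= 1 - threshold:
            labeled.append((tokens, "neg"))
    return labeled

def word_class_distributions(labeled, smoothing=1.0):
    """Estimate P(pos | word) from pseudo-labeled documents, with smoothing."""
    counts = {"pos": Counter(), "neg": Counter()}
    for tokens, label in labeled:
        counts[label].update(tokens)
    dist = {}
    for word in set(counts["pos"]) | set(counts["neg"]):
        pos = counts["pos"][word] + smoothing
        neg = counts["neg"][word] + smoothing
        dist[word] = pos / (pos + neg)
    return dist

def classify(doc, dist):
    """Stand-in second classifier: sum the log-odds of self-learned features."""
    score = sum(
        math.log(dist[t] / (1 - dist[t])) for t in doc.split() if t in dist
    )
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "unknown"

labeled = pseudo_label(DOCS)
dist = word_class_distributions(labeled)
predictions = [classify(d, dist) for d in DOCS]
```

In this toy run, the last two documents contain no lexicon words, so the initial classifier cannot label them. The words "gripping" and "tedious" are picked up as domain-specific features from the pseudo-labeled documents, which lets the second classifier assign those two documents a polarity.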

Item Type: Journal Article
Copyright Holders: 2010 Elsevier Ltd.
ISSN: 0306-4573
Keywords: sentiment analysis; opinion mining; self-training; generalized expectation; self-learned features
Academic Unit/Department: Knowledge Media Institute
Interdisciplinary Research Centre: Centre for Research in Computing (CRC)
Item ID: 28547
Depositing User: Kay Dave
Date Deposited: 19 Apr 2011 16:29
Last Modified: 24 Oct 2012 05:01
URI: http://oro.open.ac.uk/id/eprint/28547
