Cano Basave, Amparo Elizabeth; He, Yulan; Liu, Kang and Zhao, Jun
(2013).
URL: http://lang.cs.tut.ac.jp/ijcnlp2013/
Abstract
Social streams have proven to be the most up-to-date and inclusive source of information on current events. In this paper we propose a novel probabilistic modelling framework, called the violence detection model (VDM), which enables the identification of text containing violent content and the extraction of violence-related topics over social media data. The proposed VDM does not require any labeled corpora for training; instead, it only needs the incorporation of word prior knowledge, which captures whether a word indicates violence or not. We propose a novel approach to deriving word prior knowledge using the relative entropy measurement of words, based on the intuition that low-entropy words are indicative of semantically coherent topics and are therefore more informative, while high-entropy words are words whose usage is more topically diverse and are therefore less informative. Our proposed VDM has been evaluated on the TREC Microblog 2011 dataset to identify topics related to violence. Experimental results show that deriving word priors using our proposed relative entropy method is more effective than the widely used information gain method. Moreover, VDM gives better violence classification results and produces more coherent violence-related topics than a few competitive baselines.
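The entropy intuition in the abstract can be illustrated with a minimal sketch. It is not the paper's relative entropy method: it computes plain Shannon entropy of each word's distribution over document categories, and it assumes a toy collection tagged with hypothetical categories. The function name `word_entropies` and the example data are illustrative only.

```python
import math
from collections import Counter, defaultdict


def word_entropies(docs):
    """Map each word to the Shannon entropy (in bits) of its distribution
    over categories. `docs` is an iterable of (tokens, category) pairs.
    Illustrative only; not the relative entropy measure used by VDM."""
    counts = defaultdict(Counter)
    for tokens, category in docs:
        for tok in tokens:
            counts[tok][category] += 1

    entropies = {}
    for word, by_cat in counts.items():
        total = sum(by_cat.values())
        entropies[word] = -sum(
            (c / total) * math.log2(c / total) for c in by_cat.values()
        )
    return entropies


# Hypothetical toy data: token lists paired with made-up category labels.
docs = [
    (["riot", "police", "the"], "violence"),
    (["riot", "attack", "the"], "violence"),
    (["match", "goal", "the"], "sport"),
    (["album", "tour", "the"], "music"),
]
H = word_entropies(docs)
```

In this toy data, "riot" appears in only one category and gets zero entropy, while "the" is spread across all three and gets 1.5 bits, matching the intuition that low-entropy words are the more informative candidates for a word prior.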
About
- Item ORO ID
- 41416
- Item Type
- Conference or Workshop Item
- ISBN
- 4-9907348-0-7, 978-4-9907348-0-0
- Project Funding Details
- Funded Project Name: Not Set; Project ID: EP/J020427/1; Funding Body: EPSRC and DSTL
- Funded Project Name: Visiting Fellowship; Project ID: Not Set; Funding Body: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
- Keywords
- violence detection; social media
- Academic Unit or School
- Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
- Faculty of Science, Technology, Engineering and Mathematics (STEM)
- Research Group
- Centre for Research in Computing (CRC)
- Copyright Holders
- © 2013 Asian Federation of Natural Language Processing
- Depositing User
- Amparo Cano Basave