The Open University
 

Automatic labelling of topic models learned from Twitter by summarisation

Cano Basave, Amparo Elizabeth; He, Yulan and Xu, Ruifeng (2014). Automatic labelling of topic models learned from Twitter by summarisation. In: The 52nd Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference: Volume 2: Short Papers, Association for Computational Linguistics (ACL), pp. 618–624.

URL: http://acl2014.org/

Abstract

Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hidden thematic structures which provide further insights into the data. The automatic labelling of such topics derived from social media, however, poses new challenges, since topics may characterise novel events happening in the real world. Existing automatic topic labelling approaches which depend on external knowledge sources become less applicable here, since relevant articles/concepts of the extracted topics may not exist in external sources. In this paper we propose to address the problem of automatic labelling of latent topics learned from Twitter as a summarisation problem. We introduce a framework which applies summarisation algorithms to generate topic labels. These algorithms are independent of external sources and rely only on the identification of dominant terms in documents related to the latent topic. We compare the efficiency of existing state-of-the-art summarisation algorithms. Our results suggest that summarisation algorithms generate better topic labels, which capture event-related context, compared to the top-n terms returned by LDA.
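
As a rough illustration of the idea of labelling a topic from dominant terms in its related documents, the sketch below ranks terms by how many topic-related tweets they appear in. This is not the framework or any of the algorithms evaluated in the paper; the function name label_topic, the tiny stopword list, and the sample tweets are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration only.
STOPWORDS = {"the", "a", "an", "in", "of", "to", "and", "is", "on", "for", "at"}


def label_topic(documents, n_terms=3):
    """Return the n_terms terms that occur in the most documents.

    A simple document-frequency heuristic standing in for a real
    summarisation algorithm; it uses no external knowledge source.
    """
    counts = Counter()
    for doc in documents:
        # Count each term once per document so one long tweet cannot dominate.
        terms = {t for t in doc.lower().split() if t not in STOPWORDS}
        counts.update(terms)
    # Sort by descending document frequency, then alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n_terms]]


# Made-up tweets assumed to have been assigned to one latent topic.
tweets = [
    "earthquake hits the coast",
    "strong earthquake in the coast region",
    "earthquake rescue teams arrive",
]
print(label_topic(tweets, n_terms=2))
```

In practice the input documents would be the tweets most strongly associated with a topic by the topic model, and the ranking step would be replaced by whichever summarisation algorithm is under comparison.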

Item Type: Conference or Workshop Item
Copyright Holders: 2014 Association for Computational Linguistics
ISBN: 1-937284-73-5, 978-1-937284-73-2
Project Funding Details:
Funded Project Name | Project ID | Funding Body
Not Set | EP/J020427/1 | EPSRC
EU-FP7 project SENSE4US | 611242 | EU
Not Set | GJHZ20120613110641217 | Shenzhen International Cooperation Research Funding
Keywords: topic models, automatic labelling
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Item ID: 41413
Depositing User: Amparo Cano Basave
Date Deposited: 26 Nov 2014 12:57
Last Modified: 07 Dec 2018 14:55
URI: http://oro.open.ac.uk/id/eprint/41413