Cross-Lingual Classification of Crisis Data

Khare, Prashant; Burel, Grégoire; Maynard, Diana and Alani, Harith (2018). Cross-Lingual Classification of Crisis Data. In: The Semantic Web – ISWC 2018 (Vrandečić, Denny; Bontcheva, Kalina; Suárez-Figueroa, Mari Carmen; Presutti, Valentina; Celino, Irene; Sabou, Marta; Kaffee, Lucie-Aimée and Simperl, Elena eds.), Lecture Notes in Computer Science, Springer, pp. 617–633.

DOI (Digital Object Identifier) Link: https://doi.org/10.1007/978-3-030-00671-6_36

Abstract

Many citizens nowadays flock to social media during crises to share or acquire the latest information about the event. Due to the sheer volume of data typically circulated during such events, it is necessary to be able to efficiently filter out irrelevant posts, thus focusing attention on the posts that are truly relevant to the crisis. Current methods for classifying the relevance of posts to a crisis or set of crises typically struggle to deal with posts in different languages, and it is not viable during rapidly evolving crisis situations to train new models for each language. In this paper we test statistical and semantic classification approaches on cross-lingual datasets from 30 crisis events, consisting of posts written mainly in English, Spanish, and Italian. We experiment with scenarios where the model is trained on one language and tested on another, and where the data is translated to a single language. We show that the addition of semantic features extracted from external knowledge bases improves accuracy over a purely statistical model.

Item Type: Conference or Workshop Item
ISBN: 3-030-00670-0, 978-3-030-00670-9
ISSN: 0302-9743
Project Funding Details:
Funded Project Name: COMRADES
Project ID: Not Set
Funding Body: EC (European Commission): FP (inc. Horizon2020, H2020, ERC)
Keywords: Semantics; Cross-lingual; Multilingual; Crisis informatics; Tweet classification
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Item ID: 57253
Depositing User: Prashant Khare
Date Deposited: 30 Oct 2018 15:50
Last Modified: 03 May 2019 05:02
URI: http://oro.open.ac.uk/id/eprint/57253
