SenZi: A Sentiment Analysis Lexicon for the Latinised Arabic (Arabizi)

Tobaili, Taha; Fernandez, Miriam; Alani, Harith; Sharafeddine, Sanaa; Hajj, Hazem and Glavas, Goran (2019). SenZi: A Sentiment Analysis Lexicon for the Latinised Arabic (Arabizi). In: International Conference Recent Advances In Natural Language Processing 2019 Natural Language Processing in a Deep Learning World: Proceedings (Angelova, Galia; Mitkov, Ruslan; Nikolova, Ivelina and Temnikova, Irina eds.) pp. 1204–1212.

Full text available as: PDF (Accepted Manuscript), 695kB

Abstract

Arabizi is an informal written form of dialectal Arabic transcribed in Latin alphanumeric characters. It has a proven popularity on chat platforms and social media, yet it suffers from a severe lack of natural language processing (NLP) resources. As such, texts written in Arabizi are often disregarded in sentiment analysis tasks for Arabic. In this paper, we describe the creation of a sentiment lexicon for Arabizi that was enriched with word embeddings. The result is a new Arabizi lexicon consisting of 11.3K positive and 13.3K negative words. We evaluated this lexicon by classifying the sentiment of Arabizi tweets, achieving an F1-score of 0.72. We provide a detailed error analysis to present the challenges that impact the sentiment analysis of Arabizi.

Item Type: Conference or Workshop Item
ISBN: 954-452-056-2, 978-954-452-056-4
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Item ID: 66829
Depositing User: Taha Tobaili
Date Deposited: 20 Sep 2019 15:20
Last Modified: 02 Oct 2019 13:47
URI: http://oro.open.ac.uk/id/eprint/66829