Detecting Personal Life Events from Social Media

Dickinson, Thomas Kier (2019). Detecting Personal Life Events from Social Media. PhD thesis. The Open University.

Abstract

Social media has become a dominant force over the past 15 years, with the rise of sites such as Facebook, Instagram, and Twitter. Some of us have been with these sites since the start, posting about our personal lives and building up a digital identity.

But within this myriad of posts, what actually matters to us, and what do our digital identities tell people about us? One way to start filtering this data is to build classifiers that identify posts about our personal life events, allowing us to reflect on what we share online.

This type of technology also has direct merits within marketing, allowing companies to target customers with better products. We also suggest that the techniques and methodologies developed throughout this thesis could support research in other areas, such as cyberbullying and radicalisation detection.

The aim of this thesis is to build upon the under-researched area of life event detection, specifically targeting Twitter and Instagram. Our goal is to develop classifiers that identify life events drawn from a list inspired by cognitive psychology; we target a total of seven events within this thesis.

To achieve this, we answer three research questions, one in each of our empirical chapters. In our first empirical chapter, we ask: what features would improve the classification of important life events? To answer this, we first extract a new dataset from Twitter targeting the following events: Getting Married, Having Children, Starting School, Falling in Love, and Death of a Parent. We look at three new feature sets (interaction, content, and semantic features) and compare them against a current state-of-the-art technique.

In our second empirical chapter, we draw inspiration from cheminformatics and frequent sub-graph mining to ask: could the inclusion of semantic and syntactic patterns improve the performance of our life event classifier? Here we expand our tweets into semantic networks and consider two forms of syntactic relationships between tokens. We then mine for frequent sub-graphs amongst our tweet graphs and use these as features in our classifier. Our results produce F1 scores between 0.65 and 0.77, an improvement of between 0.01 and 0.04 over the current state of the art.
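
To make the idea of sub-graph features concrete, the following is a minimal Python sketch, not the thesis's implementation. It assumes a toy graph in which adjacent tokens are linked (a stand-in for the semantic and syntactic relationships described above), restricts patterns to one or two connected edges rather than running a full frequent sub-graph miner, and uses hypothetical names (`tweet_to_edges`, `candidate_patterns`, `mine_frequent_patterns`, `featurise`, `min_support`) and made-up example tweets.

```python
from collections import Counter
from itertools import combinations


def tweet_to_edges(tweet):
    """Toy tweet graph (assumption): one undirected edge per pair of adjacent tokens."""
    tokens = tweet.lower().split()
    return {
        tuple(sorted(pair))
        for pair in zip(tokens, tokens[1:])
        if pair[0] != pair[1]  # skip self-loops from repeated tokens
    }


def candidate_patterns(edges):
    """Small connected sub-graphs: single edges, plus pairs of edges sharing a node."""
    patterns = {frozenset([edge]) for edge in edges}
    for a, b in combinations(sorted(edges), 2):
        if set(a) & set(b):
            patterns.add(frozenset([a, b]))
    return patterns


def mine_frequent_patterns(tweets, min_support=2):
    """Keep patterns that occur in at least `min_support` tweet graphs."""
    support = Counter()
    for tweet in tweets:
        # Each pattern is counted at most once per tweet.
        support.update(candidate_patterns(tweet_to_edges(tweet)))
    frequent = [p for p, count in support.items() if count >= min_support]
    return sorted(frequent, key=lambda p: sorted(p))


def featurise(tweet, frequent_patterns):
    """Binary vector: 1 where the tweet graph contains the frequent pattern."""
    edges = tweet_to_edges(tweet)
    return [1 if pattern <= edges else 0 for pattern in frequent_patterns]


# Made-up tweets for illustration only; not taken from the thesis dataset.
tweets = [
    "we just got married today",
    "they just got married yesterday",
    "starting school today",
]
frequent = mine_frequent_patterns(tweets, min_support=2)
vector = featurise("she just got married", frequent)
```

The resulting vectors could then be passed, alongside other features, to any standard classifier. Which classifier and which graph construction the thesis uses is not stated in this abstract.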

In our final empirical chapter, we answer our third research question: how can we detect important life events from other social media sites, such as Instagram? We ask this question because we believe Instagram to be a preferred environment for sharing personal life events. In this chapter, we extract a new dataset targeting the following events: Getting Married, Having Children, Starting School, Graduation, and Buying a House. Our methodology achieves F1 scores between 0.78 and 0.82, an improvement of between 0.01 and 0.04 over the current state of the art.
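
The results above are reported as F1 scores, the harmonic mean of precision and recall. As a reminder of how such a score is computed for a single event class, here is a small Python sketch. The function name `f1_score` and the counts in the usage line are illustrative assumptions, not figures from the thesis.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 for one class: harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one life event class (not thesis data).
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

Because F1 is bounded between 0 and 1, an improvement of 0.01 to 0.04 corresponds to one to four percentage points over the baseline.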

Item Type: Thesis (PhD)
Copyright Holders: 2018 The Author
Keywords: digital media; interpersonal communication; data protection; personal information management; online identities; privacy; online social networks; data mining
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM)
Faculty of Science, Technology, Engineering and Mathematics (STEM) > Knowledge Media Institute (KMi)
Item ID: 68265
Depositing User: Tom Dickinson
Date Deposited: 23 Dec 2019 15:50
Last Modified: 14 Jun 2020 17:30
URI: http://oro.open.ac.uk/id/eprint/68265