The Open UniversitySkip to content
 

Traversing news with ant colony optimisation and negative pheromones

Sousa-Rodrigues, David and Ramos, Vitorino (2014). Traversing news with ant colony optimisation and negative pheromones. In: European Conference on Complex Systems 2014, 22-26 September 2014, Lucca, Italy.

Full text available as:
[img]
Preview
PDF (Version of Record) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Download (86kB) | Preview
URL: http://www.eccs14.eu/components/com_comunicazioni/...
Google Scholar: Look up in Google Scholar

Abstract

The past decade has seen the rapid development of the online newsroom. News published online are the main outlet of news surpassing traditional printed newspapers. This poses challenges to the production and to the consumption of those news. With those many sources of information available it is important to find ways to cluster and organise the documents if one wants to understand this new system.

Traditional approaches to the problem of clustering documents usually embed the documents in a suitable similarity space. Previous studies have reported on the impact of the similarity measures used for clustering of textual corpora [1]. These similarity measures usually are calculated for bag of words representations of the documents. This makes the final document-word matrix high dimensional. Feature vectors with more than 10,000 dimensions are common and algorithms have severe problems with the high dimensionality of the data.

A novel bio inspired approach to the problem of traversing the news is presented. It finds Hamiltonian cycles over documents published by the newspaper The Guardian. A Second Order Swarm Intelligence algorithm based on Ant Colony Optimisation was developed [2, 3] that uses a negative pheromone to mark unrewarding paths with a “no-entry” signal. This approach follows recent findings of negative pheromone usage in real ants [4].

In this case study the corpus of data is represented as a bipartite relation between documents and keywords entered by the journalists to characterise the news. A new similarity measure between documents is presented based on the Q-analysis description [5, 6, 7] of the simplicial complex formed between documents and keywords. The eccentricity between documents (two simplicies) is then used as a novel measure of similarity between documents.

The results prove that the Second Order Swarm Intelligence algorithm performs better in benchmark problems of the travelling salesman problem, with faster convergence and optimal results. The addition of the negative pheromone as a non-entry signal clearly improved the quality of the solutions. The application of the algorithm to the corpus of news of The Guardian creates a coherent navigation system among the news. This allows the users to navigate the news published during a certain period of time in a semantic sequence instead of a time sequence.

This work as broader application as it can be applied to many cases where the data is mapped to bipartite relations (e.g. protein expressions in cells, sentiment analysis, brand awareness in social media, routing problems), as it highlights the connectivity of the underlying complex system.

Item Type: Conference or Workshop Item
Copyright Holders: 2014 The Authors
Keywords: swarm intelligence; second order swarm intelligence; ant colony optimisation; aco; ants
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Engineering and Innovation
Faculty of Science, Technology, Engineering and Mathematics (STEM)
Related URLs:
Item ID: 40202
Depositing User: David Rodrigues
Date Deposited: 29 Apr 2015 15:29
Last Modified: 05 Oct 2016 05:16
URI: http://oro.open.ac.uk/id/eprint/40202
Share this page:

Download history for this item

These details should be considered as only a guide to the number of downloads performed manually. Algorithmic methods have been applied in an attempt to remove automated downloads from the displayed statistics but no guarantee can be made as to the accuracy of the figures.

Actions (login may be required)

Policies | Disclaimer

© The Open University   contact the OU