
Semantics and statistics for automated image annotation

Llorente, Ainhoa (2010). Semantics and statistics for automated image annotation. PhD thesis, The Open University.
Full text available as:
PDF (Version of Record), 2655Kb

    Abstract

    Automated image annotation comprises a number of techniques that aim to find the correlation between words and image features, such as colour, shape, and texture, in order to assign correct annotation words to images. In particular, approaches based on Bayesian theory use machine-learning techniques to learn statistical models from a training set of pre-annotated images and apply them to generate annotations for unseen images.

    The focus of this thesis lies in demonstrating that an approach that goes beyond learning the statistical correlation between words and visual features, and also exploits information about the actual semantics of the words used in the annotation process, is able to improve the performance of probabilistic annotation systems. Specifically, I present three experiments. Firstly, I introduce a novel approach that uses semantic relatedness measures to automatically refine the annotation words generated by a non-parametric density estimation model. Initially, I consider semantic measures based on the co-occurrence of words in the training set. However, this approach has limitations, as its performance depends on the quality and coverage of the training data. For this reason, I devise an alternative solution that combines semantic measures based on knowledge sources, such as WordNet and Wikipedia, with word co-occurrence in the training set and on the web, to achieve statistically significant improvements over the baseline. Secondly, I investigate the effect of using semantic measures inside an evaluation measure that computes the performance of an automated image annotation system whose annotation words adopt the hierarchical structure of an ontology, as is the case for the ImageCLEF2009 collection. Finally, I propose a Markov Random Field that exploits the semantic context dependencies of the image. The best result obtains a mean average precision of 0.32, which is consistent with the state of the art in automated image annotation for the Corel 5k dataset.
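
    For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean average precision is commonly computed in ranked retrieval. It is the textbook definition, not necessarily the exact evaluation protocol used in the thesis. The function names, and the assumption that each query is an annotation word whose ranked list contains image identifiers, are illustrative and do not come from the source.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against its set of relevant items."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, accumulated only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
) -> float:
    """Mean of the per-query average precision over all queries.

    Assumption: keys are query words and values are ranked image IDs.
    """
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranked, ground_truth.get(query, set()))
        for query, ranked in rankings.items()
    )
    return total / len(rankings)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    rankings = {"tiger": ["img1", "img2", "img3"], "sky": ["img4", "img5"]}
    ground_truth = {"tiger": {"img1", "img3"}, "sky": {"img5"}}
    print(mean_average_precision(rankings, ground_truth))
```

    Under this definition, a score of 1.0 means every relevant image is ranked ahead of every non-relevant one for every query, so higher values indicate better rankings.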

    Item Type: Thesis (PhD)
    Copyright Holders: 2010 Ainhoa Llorente
    Funders: The Open University, Santander Universities, Robotiker-Tecnalia
    Keywords: automated image annotation; statistical methods; semantic; Markov Random Fields; semantic similarity; visual image features; WordNet; Wikipedia
    Academic Unit/Department: Knowledge Media Institute
    Item ID: 25663
    Depositing User: Ainhoa Llorente
    Date Deposited: 20 Dec 2010 08:36
    Last Modified: 21 Dec 2010 17:07
    URI: http://oro.open.ac.uk/id/eprint/25663