Llorente, Ainhoa; Motta, Enrico and Rüger, Stefan (2009). Exploring the semantics behind a collection to improve automated image annotation. In: 10th Workshop of the Cross-Language Evaluation Forum (CLEF 2009), 30 Sep - 2 Oct 2009, Corfu, Greece.
Abstract
This research explores several semantic relatedness measures that help to refine the annotations generated by a baseline non-parametric density estimation algorithm. We analyse the benefits of computing a statistical correlation over the training set or over the World Wide Web, compared with approaches based on a thesaurus such as WordNet or Wikipedia (considered as a hyperlink structure). Experiments are carried out using the dataset provided by the 2009 edition of the ImageCLEF competition, a subset of the MIR-Flickr 25k collection. The best results are obtained by the approaches based on statistical correlation because, unlike those based on WordNet and Wikipedia, they do not depend on a prior disambiguation phase. Further work is needed to assess whether proper disambiguation schemes might improve the performance of the WordNet- and Wikipedia-based approaches.
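As an illustration of what a statistical correlation over the training set can look like, the following minimal sketch derives a keyword relatedness score from annotation co-occurrence. It is not the measure used in the paper, which the abstract does not specify: the Jaccard coefficient, the function name cooccurrence_relatedness and the toy annotations are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_relatedness(annotations):
    """Build a relatedness function from per-image keyword lists.

    Assumption: relatedness is the Jaccard coefficient of the sets of
    images annotated with each keyword. This is an illustrative choice,
    not the measure reported in the paper.
    """
    keyword_counts = Counter()  # number of images containing each keyword
    pair_counts = Counter()     # number of images containing both keywords
    for keywords in annotations:
        unique = sorted(set(keywords))
        keyword_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    def relatedness(a, b):
        if a == b:
            return 1.0
        both = pair_counts[tuple(sorted((a, b)))]
        either = keyword_counts[a] + keyword_counts[b] - both
        return both / either if either else 0.0

    return relatedness


# Hypothetical toy training annotations, one keyword list per image.
training = [
    ["sky", "clouds", "landscape"],
    ["sky", "clouds"],
    ["sky", "sea"],
    ["indoor", "portrait"],
]
relatedness = cooccurrence_relatedness(training)
print(relatedness("sky", "clouds"))
print(relatedness("sky", "indoor"))
```

A score of this kind could be used to lower the confidence of a candidate annotation that rarely co-occurs with the other keywords predicted for the same image. Because it relies only on surface co-occurrence, it needs no word sense disambiguation step.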