Nowak, Stefanie; Lukashevich, Hanna; Dunker, Peter and Rüger, Stefan
|DOI (Digital Object Identifier) Link:||http://doi.org/10.1145/1743384.1743398|
With the steadily increasing number of multimedia documents on the web and at home, the need grows for reliable semantic indexing methods that assign multiple keywords to a document. The performance of existing approaches is often measured with standard evaluation measures from the information retrieval community. In a case study on image annotation, we show the behaviour of 13 different evaluation measures and point out their strengths and weaknesses. The analysis uses data from 19 research groups that participated in the ImageCLEF Photo Annotation Task, together with several configurations based on random numbers. We investigated a recently proposed ontology-based measure that incorporates structure information, relationships from the ontology, and the agreement between annotators for a concept, and compared it to a hierarchical variant. The results for the hierarchical measure are not competitive. The ontology-based measure assigns good scores to the systems that also ranked well under the other measures, such as the example-based F-measure. For concept-based evaluation, MAP gave stable results with respect to random numbers and the number of annotated labels. The AUC measure shows good evaluation characteristics provided that all annotations contain confidence values.
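To make the distinction between example-based and concept-based evaluation more concrete, the following is a minimal Python sketch of two of the measures named above: an example-based F-measure and MAP. It is not the paper's implementation. The function names, the data layout, and the handling of edge cases (empty label sets, concepts with no relevant images) are assumptions made for illustration only.

```python
from typing import Dict, Sequence, Set


def example_based_f1(truth: Sequence[Set[str]], predicted: Sequence[Set[str]]) -> float:
    """Average the per-document F1 score over all documents."""
    scores = []
    for t, p in zip(truth, predicted):
        if not t and not p:
            # Assumption: two empty label sets count as a perfect match.
            scores.append(1.0)
            continue
        overlap = len(t & p)
        # F1 = 2 * |T ∩ P| / (|T| + |P|)
        scores.append(2 * overlap / (len(t) + len(p)))
    return sum(scores) / len(scores)


def average_precision(relevant: Sequence[bool], confidence: Sequence[float]) -> float:
    """Average precision for one concept, ranking documents by confidence."""
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            precision_sum += hits / rank
    # Assumption: a concept with no relevant documents scores 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    relevant: Dict[str, Sequence[bool]],
    confidence: Dict[str, Sequence[float]],
) -> float:
    """Mean of the per-concept average precision values."""
    aps = [average_precision(relevant[c], confidence[c]) for c in relevant]
    return sum(aps) / len(aps)
```

The example-based measure compares label sets per image and needs only binary annotations, while MAP ranks images per concept and therefore relies on confidence values.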
|Item Type:||Conference Item|
|Copyright Holders:||2010 ACM|
|Keywords:||experimentation; performance; measurement|
|Academic Unit/Department:||Knowledge Media Institute|
|Depositing User:||Kay Dave|
|Date Deposited:||06 Jan 2011 11:48|
|Last Modified:||23 Mar 2016 11:27|