Nowak, Stefanie; Lukashevich, Hanna; Dunker, Peter and Rüger, Stefan
Performance measures for multilabel evaluation.
In: The 11th ACM International Conference on Multimedia Information Retrieval (MIR), 29-31 Mar 2010, Philadelphia, PA, USA, p. 35.
With the steadily increasing amount of multimedia documents on the web and at home, the need grows for reliable semantic indexing methods that assign multiple keywords to a document. The performance of existing approaches is often measured with standard evaluation measures from the information retrieval community. In a case study on image annotation, we show the behaviour of 13 different evaluation measures and point out their strengths and weaknesses. For the analysis, data from 19 research groups that participated in the ImageCLEF Photo Annotation Task are used, together with several configurations based on random numbers. We investigate a recently proposed ontology-based measure, which incorporates structure information, relationships from the ontology, and the agreement between annotators for a concept, and compare it to a hierarchical variant. The results for the hierarchical measure are not competitive. The ontology-based measure assigns good scores to the systems that also rank well under the other measures, such as the example-based F-measure. For concept-based evaluation, MAP yields stable results with respect to random numbers and the number of annotated labels. The AUC measure shows good evaluation characteristics provided that all annotations contain confidence values.
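To make two of the measures named in the abstract concrete, the following is a minimal sketch of an example-based F-measure and of the average precision that underlies MAP. It uses the common textbook definitions, which may differ in detail from the variants evaluated in the paper. The function names, the handling of empty label sets, and the toy data are assumptions made for illustration only.

```python
from typing import List, Set


def example_based_f1(truth: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Average the per-document F1 between true and predicted label sets."""
    scores = []
    for t, p in zip(truth, predicted):
        if not t and not p:
            # Assumption: two empty label sets count as perfect agreement.
            scores.append(1.0)
            continue
        overlap = len(t & p)
        scores.append(2 * overlap / (len(t) + len(p)))
    return sum(scores) / len(scores)


def average_precision(relevant: List[bool], confidence: List[float]) -> float:
    """Average precision for one concept, ranking documents by confidence."""
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


# Hypothetical toy data: two images with true and predicted keyword sets.
truth = [{"sky", "sea"}, {"night"}]
predicted = [{"sky"}, {"night", "city"}]
f1 = example_based_f1(truth, predicted)

# Hypothetical confidences for a single concept over three images.
ap = average_precision([True, False, True], [0.9, 0.8, 0.3])
```

MAP would then be the mean of `average_precision` over all concepts. Because the ranking depends on confidence values, this kind of measure only applies when systems submit scores rather than binary decisions, which matches the abstract's remark about AUC.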
Keywords: experimentation; performance; measurement
Knowledge Media Institute
06 Jan 2011 11:48
26 Oct 2012 04:39