Kaliciak, Leszek; Song, Dawei; Wiratunga, Nirmalie and Pan, Jeff
Combining visual and textual systems within the context of user feedback.
In: 19th International Conference on MultiMedia Modeling (MMM2013), 5-7 January, 2013, Huangshan, China, pp. 445–455.
Experiments have shown that combining textual and visual representations can improve retrieval performance. This is because the textual and visual feature spaces often represent complementary yet correlated aspects of the same image, thus forming a composite system.
In this paper, we present a model for combining visual and textual sub-systems within the user feedback context. The model is inspired by measurement in quantum mechanics (QM) and by the tensor product of co-occurrence (density) matrices, which represents the density matrix of a composite system in QM. It provides a sound and natural framework for seamlessly integrating multiple feature spaces by treating them as a composite system, as well as a new way of measuring the relevance of an image with respect to a context. The proposed approach takes into account both intra-feature relationships (via co-occurrence matrices) and inter-feature relationships (via the tensor operator) between feature dimensions. It is also computationally cheap and scales to large data collections. We test our approach on the ImageCLEF2007photo data collection and present interesting findings.
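The composite-system idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact relevance measure, so the sketch assumes the standard QM projective measurement, where the score of a unit vector x against a density matrix rho is x^T rho x. All function and variable names are hypothetical, and the feedback data are random placeholders. The sketch also shows one reason such a model can be cheap: for a tensor product of density matrices and a product vector, the measurement factorises into two small per-modality measurements, so the large composite matrix never has to be built.

```python
import numpy as np

def density_matrix(feedback_vectors):
    """Co-occurrence (density) matrix of one feature space.

    Sums the outer products of the feedback vectors and normalises
    the result to unit trace (assumed construction, not from the paper).
    """
    X = np.asarray(feedback_vectors, dtype=float)
    M = X.T @ X                      # sum of outer products x x^T
    return M / np.trace(M)

def unit(x):
    """Scale a vector to unit Euclidean length."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)

def relevance_full(rho_v, rho_t, v, t):
    """Measurement on the explicit composite system (expensive)."""
    rho = np.kron(rho_v, rho_t)      # density matrix of the composite system
    x = np.kron(v, t)                # composite representation of the image
    return float(x @ rho @ x)

def relevance_factored(rho_v, rho_t, v, t):
    """Same score without building the composite matrix (cheap)."""
    return float(v @ rho_v @ v) * float(t @ rho_t @ t)

# Placeholder data: 3 feedback images, 4 visual and 5 textual dimensions.
rng = np.random.default_rng(0)
visual_feedback = rng.random((3, 4))
textual_feedback = rng.random((3, 5))

rho_v = density_matrix(visual_feedback)
rho_t = density_matrix(textual_feedback)

# A candidate image, represented by unit vectors in both feature spaces.
v = unit(rng.random(4))
t = unit(rng.random(5))

score_full = relevance_full(rho_v, rho_t, v, t)
score_fact = relevance_factored(rho_v, rho_t, v, t)
print(score_full, score_fact)        # the two scores agree up to rounding
```

The factored form costs two small quadratic forms per image rather than one quadratic form in the product dimension, which is consistent with the scalability claim above, although the paper's own argument may differ.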