Improving Bag-of-visual-Words model with spatial-temporal correlation for video retrieval

Wang, Lei; Song, Dawei and Elyan, Eyad (2012). Improving Bag-of-visual-Words model with spatial-temporal correlation for video retrieval. In: 21st ACM Conference on Information and Knowledge Management (CIKM 2012), 30 Oct - 1 Nov 2012, Hawaii, USA.


Most state-of-the-art approaches to Query-by-Example (QBE) video retrieval are based on the Bag-of-visual-Words (BovW) representation of visual content. This representation, however, ignores spatial-temporal information, which is important for similarity measurement between videos. Direct incorporation of such information into the video data representation for a large-scale data set is computationally expensive in terms of storage and similarity measurement. Such a representation is also static, regardless of how the discriminative power of visual words changes with respect to different queries. To tackle these limitations, in this paper, we propose to discover Spatial-Temporal Correlations (STC) imposed by the query example to improve the BovW model for video retrieval. The STC, in terms of spatial proximity and relative motion coherence between different visual words, is crucial to identify the discriminative power of the visual words. We develop a novel technique to emphasize the most discriminative visual words for similarity measurement, and incorporate this STC-based approach into the standard inverted index architecture. Our approach is evaluated on the TRECVID2002 and CC_WEB_VIDEO datasets for two typical QBE video retrieval tasks respectively. The experimental results demonstrate that it substantially improves on both the BovW model and a state-of-the-art method that also utilizes spatial-temporal information for QBE video retrieval.
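
The general idea of query-dependent word emphasis on top of an inverted index can be illustrated with a minimal sketch. This is not the authors' STC method: the abstract does not give the weighting formula, so the per-word weights below are hypothetical placeholders standing in for whatever discriminative power a query-side analysis would assign. The function names, the toy visual word ids, and the histogram-intersection scoring are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(videos):
    """Map each visual word id to {video_id: term frequency}."""
    index = defaultdict(dict)
    for video_id, words in videos.items():
        for word, tf in Counter(words).items():
            index[word][video_id] = tf
    return index


def search(index, query_words, word_weights=None):
    """Rank videos by weighted histogram intersection with the query.

    word_weights is a hypothetical per-query weight for each visual word
    (a stand-in for query-dependent discriminative power). Words missing
    from it default to 1.0, which gives plain BovW matching.
    """
    word_weights = word_weights or {}
    scores = defaultdict(float)
    for word, q_tf in Counter(query_words).items():
        weight = word_weights.get(word, 1.0)
        # Only the posting list of a query word is visited.
        for video_id, tf in index.get(word, {}).items():
            scores[video_id] += weight * min(q_tf, tf)
    # Highest score first; ties broken by video id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy collection: each video is a list of quantized visual word ids.
videos = {
    "v1": [3, 3, 7, 9],
    "v2": [3, 5, 5, 9],
    "v3": [1, 2, 7, 7],
}
index = build_inverted_index(videos)
query = [3, 7, 7]

baseline = search(index, query)                              # unweighted BovW
reweighted = search(index, query, word_weights={7: 3.0})     # emphasize word 7
```

In this toy setting, raising the weight of a single visual word changes the ranking without touching the index itself, which is why a query-time weighting scheme can sit on top of a standard inverted index.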

Item Type: Conference or Workshop Item
Copyright Holders: 2012 ACM
Extra Information: ISBN: 978-1-4503-1156-4
Keywords: spatial-temporal correlation; content based video retrieval; query-by-example; Bag-of-Visual-Words
Academic Unit/School: Faculty of Science, Technology, Engineering and Mathematics (STEM) > Computing and Communications
Item ID: 34678
Depositing User: Dawei Song
Date Deposited: 16 Oct 2012 12:40
Last Modified: 07 Dec 2018 23:09