Improving bag-of-visual-words model with spatial-temporal correlation for video retrieval.

Wang, Lei; Song, Dawei; Elyan, Eyad

Authors

Lei Wang

Dawei Song

Eyad Elyan

Abstract

Most of the state-of-art approaches to Query-by-Example (QBE) video retrieval are based on the Bag-of-visual-Words (BovW) representation of visual content. It, however, ignores the spatial-temporal information, which is important for similarity measurement between videos. Direct incorporation of such information into the video data representation for a large scale data set is computationally expensive in terms of storage and similarity measurement. It is also static regardless of the change of discriminative power of visual words for different queries. To tackle these limitations, in this paper, we propose to discover Spatial-Temporal Correlations (STC) imposed by the query example to improve the BovW model for video retrieval. The STC, in terms of spatial proximity and relative motion coherence between different visual words, is crucial to identify the discriminative power of the visual words. We develop a novel technique to emphasize the most discriminative visual words for similarity measurement, and incorporate this STC-based approach into the standard inverted index architecture. Our approach is evaluated on the TRECVID2002 and CC_WEB_VIDEO datasets for two typical QBE video retrieval tasks respectively. The experimental results demonstrate that it substantially improves the BovW model as well as a state of the art method that also utilizes spatial-temporal information for QBE video retrieval.
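
The general idea of emphasizing some visual words inside an inverted index can be illustrated with a minimal sketch. This is not the authors' STC method: the abstract does not give the formula for deriving weights from spatial proximity and motion coherence, so the per-word weights here are simply supplied by hand. The function names `build_inverted_index` and `score`, the toy video data, and the term-frequency product used as the similarity are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(videos):
    """Map each visual-word id to {video_id: term frequency}.

    `videos` is a dict of video_id -> list of visual-word ids
    (a hypothetical, already-quantized BovW representation).
    """
    index = defaultdict(dict)
    for vid, words in videos.items():
        for word, tf in Counter(words).items():
            index[word][vid] = tf
    return index


def score(query_words, index, weights=None):
    """Rank videos by a weighted term-frequency dot product with the query.

    `weights` maps visual-word id -> query-dependent emphasis.
    Words without an entry get weight 1.0, which gives plain BovW matching.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for word, q_tf in Counter(query_words).items():
        emphasis = weights.get(word, 1.0)
        # Only videos that contain this word are touched (inverted-index lookup).
        for vid, tf in index.get(word, {}).items():
            scores[vid] += emphasis * q_tf * tf
    # Highest score first; ties broken by video id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: three videos described by visual-word ids.
videos = {"a": [1, 1, 2], "b": [2, 2, 3], "c": [4]}
index = build_inverted_index(videos)
query = [1, 2]

baseline = score(query, index)  # unweighted BovW matching
weighted = score(query, index, weights={1: 0.5, 2: 3.0})  # hand-picked weights
```

With uniform weights, video `a` ranks first; once word `2` is emphasized and word `1` is down-weighted, video `b` overtakes it. In the paper's setting such weights would depend on the query example rather than being fixed by hand.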

Citation

WANG, L., SONG, D. and ELYAN, E. 2012. Improving bag-of-visual-words model with spatial-temporal correlation for video retrieval. In Proceedings of the 21st Association for Computing Machinery (ACM) international conference on information and knowledge management (CIKM'12), 29 October - 2 November 2012, Maui, USA. New York: ACM [online], pages 1303-1312. Available from: https://doi.org/10.1145/2396761.2398433

Conference Name 21st Association for Computing Machinery (ACM) international conference on information and knowledge management (CIKM'12)
Start Date Oct 29, 2012
End Date Nov 2, 2012
Acceptance Date Oct 31, 2012
Online Publication Date Oct 31, 2012
Publication Date Dec 31, 2012
Deposit Date Jan 21, 2015
Publicly Available Date Jan 21, 2015
Publisher Association for Computing Machinery
Pages 1303-1312
DOI https://doi.org/10.1145/2396761.2398433
Keywords Spatial; Temporal; Correlation; Discriminative visual word; Content based; Video; Retrieval; Query by Example; Bag of visual; Word
Public URL http://hdl.handle.net/10059/1129
