Lei Wang
Improving bag-of-visual-words model with spatial-temporal correlation for video retrieval.
Wang, Lei; Song, Dawei; Elyan, Eyad
Abstract
Most of the state-of-art approaches to Query-by-Example (QBE) video retrieval are based on the Bag-of-visual-Words (BovW) representation of visual content. It, however, ignores the spatial-temporal information, which is important for similarity measurement between videos. Direct incorporation of such information into the video data representation for a large scale data set is computationally expensive in terms of storage and similarity measurement. It is also static regardless of the change of discriminative power of visual words for different queries. To tackle these limitations, in this paper, we propose to discover Spatial-Temporal Correlations (STC) imposed by the query example to improve the BovW model for video retrieval. The STC, in terms of spatial proximity and relative motion coherence between different visual words, is crucial to identify the discriminative power of the visual words. We develop a novel technique to emphasize the most discriminative visual words for similarity measurement, and incorporate this STC-based approach into the standard inverted index architecture. Our approach is evaluated on the TRECVID2002 and CC_WEB_VIDEO datasets for two typical QBE video retrieval tasks respectively. The experimental results demonstrate that it substantially improves the BovW model as well as a state of the art method that also utilizes spatial-temporal information for QBE video retrieval.
Citation
WANG, L., SONG, D. and ELYAN, E. 2012. Improving bag-of-visual-words model with spatial-temporal correlation for video retrieval. In Proceedings of the 21st Association for Computing Machinery (ACM) international conference on information and knowledge management (CIKM'12), 29 October - 2 November 2012, Maui, USA. New York: ACM [online], pages 1303-1312. Available from: https://doi.org/10.1145/2396761.2398433
Conference Name | 21st Association for Computing Machinery (ACM) international conference on information and knowledge management (CIKM'12) |
---|---|
Start Date | Oct 29, 2012 |
End Date | Nov 2, 2012 |
Acceptance Date | Oct 31, 2012 |
Online Publication Date | Oct 31, 2012 |
Publication Date | Dec 31, 2012 |
Deposit Date | Jan 21, 2015 |
Publicly Available Date | Jan 21, 2015 |
Publisher | Association for Computing Machinery |
Pages | 1303-1312 |
DOI | https://doi.org/10.1145/2396761.2398433 |
Keywords | Spatial; Temporal; Correlation; Discriminative visual word; Content based; Video; Retrieval; Query by Example; Bag of visual; Word |
Public URL | http://hdl.handle.net/10059/1129 |
Files
WANG 2012 Improving bag-of-visual-words
(2 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc-nd/4.0/