Min Li
MVVA-net: a video aesthetic quality assessment network with cognitive fusion of multi-type feature–based strong generalization.
Li, Min; Wang, Zheng; Ren, Jinchang; Sun, Meijun
Abstract
The growing popularity of short videos on social media platforms makes evaluating their aesthetic quality a significant challenge. In this paper, we first construct a large-scale and properly annotated short video aesthetics (SVA) dataset of 6900 video shots. We then propose a cognitive multi-type feature fusion network (MVVA-Net) for video aesthetic quality assessment. MVVA-Net consists of two branches, an intra-frame aesthetics branch and an inter-frame aesthetics branch, which take different types of video frames as input. The inter-frame branch extracts inter-frame aesthetic features from sequential frames sampled at fixed intervals, while the intra-frame branch extracts intra-frame aesthetic features from key frames selected by the inter-frame difference method. The two types of features are fused adaptively to evaluate the aesthetic quality of the video. Because MVVA-Net does not require a fixed number of input frames, its generalization ability is greatly enhanced. We performed quantitative comparisons and ablation studies. The experimental results show that the two branches effectively extract the intra-frame and inter-frame aesthetic features of different videos and that, through the adaptive fusion of these features, MVVA-Net achieves better classification performance and stronger generalization ability than other methods, performing well on different datasets.
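The two frame-selection strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `step` and `top_k` parameters, and the use of mean absolute pixel difference as the inter-frame difference score are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def sample_fixed_interval(frames: np.ndarray, step: int) -> np.ndarray:
    """Return indices of frames taken every `step` frames (hypothetical helper).

    Stands in for the sequential frames fed to the inter-frame branch.
    """
    return np.arange(0, len(frames), step)


def key_frames_by_difference(frames: np.ndarray, top_k: int) -> np.ndarray:
    """Return indices of the `top_k` frames that differ most from their predecessor.

    Assumption: the difference score is the mean absolute pixel difference
    between consecutive frames; the paper may use a different measure.
    """
    if len(frames) < 2:
        # Nothing to compare against, so every frame is trivially a key frame.
        return np.arange(len(frames))
    as_float = frames.astype(np.float32)  # avoid uint8 wrap-around on subtraction
    diffs = np.abs(as_float[1:] - as_float[:-1])
    # One score per consecutive pair; score i belongs to frame i + 1.
    scores = diffs.reshape(len(diffs), -1).mean(axis=1)
    k = min(top_k, len(scores))
    top = np.argsort(scores)[::-1][:k] + 1
    return np.sort(top)


if __name__ == "__main__":
    # Synthetic 10-frame grayscale clip with abrupt changes at frames 3 and 7.
    clip = np.zeros((10, 4, 4), dtype=np.uint8)
    clip[3:] = 100
    clip[7:] = 250
    print("fixed-interval indices:", sample_fixed_interval(clip, step=4))
    print("key-frame indices:", key_frames_by_difference(clip, top_k=2))
```

Both helpers return index arrays of whatever length the clip yields, which loosely mirrors the abstract's point that the model does not require a fixed number of input frames.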
Citation
LI, M., WANG, Z., REN, J. and SUN, M. 2022. MVVA-net: a video aesthetic quality assessment network with cognitive fusion of multi‑type feature–based strong generalization. Cognitive computation [online], 14(4), pages 1435-1445. Available from: https://doi.org/10.1007/s12559-021-09947-1
Field | Value |
---|---|
Journal Article Type | Article |
Acceptance Date | Sep 29, 2021 |
Online Publication Date | Mar 12, 2022 |
Publication Date | Jul 31, 2022 |
Deposit Date | Jun 30, 2022 |
Publicly Available Date | Mar 13, 2023 |
Journal | Cognitive Computation |
Print ISSN | 1866-9956 |
Electronic ISSN | 1866-9964 |
Publisher | Springer |
Peer Reviewed | Peer Reviewed |
Volume | 14 |
Issue | 4 |
Pages | 1435-1445 |
DOI | https://doi.org/10.1007/s12559-021-09947-1 |
Keywords | Videos; Social media platforms; Aesthetic quality; Short video aesthetics (SVA); Multi-type feature fusion network (MVVA-Net) |
Public URL | https://rgu-repository.worktribe.com/output/1628644 |
Related Public URLs | https://rgu-repository.worktribe.com/output/1628665 |
Files
LI 2022 MVVA-net (AAM)
(859 Kb)
PDF
Copyright Statement
This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s12559-021-09947-1