This file contains only basic metadata and links to where the website can be accessed. As a result, the licence under which this file is shared on OpenAIR is not necessarily the same as the licence used for the website content itself. Please consult the terms and conditions of use for the website directly.

GENERAL INFORMATION

1. Title of Dataset: MVVA-net: a video aesthetic quality assessment network with cognitive fusion of multi-type feature–based strong generalization. [Dataset].

2. Contributor information:
Min Li (Tianjin University)
Zheng Wang (Tianjin University)
Jinchang Ren (Robert Gordon University)
Meijun Sun (Tianjin University)

3. Date on which website last updated: 2021-07-08

4. Funding:
Science and Technology Support Program, Yunnan Province [202002AD080001]
National Natural Science Foundation of China [61772360; 16876125; 62076180]

ACCESS INFORMATION

1. Access Links:
GitHub: https://github.com/Lm0324/MVVA-Net
Internet Archive (Version dated 2022-07-01): https://web.archive.org/web/20220701141505/https://github.com/Lm0324/MVVA-Net

2. Recommended citation: LI, M., WANG, Z., REN, J. and SUN, M. 2022. MVVA-net: a video aesthetic quality assessment network with cognitive fusion of multi-type feature–based strong generalization. [Dataset]. Hosted on GitHub [online]. Available from: https://github.com/Lm0324/MVVA-Net

CONTEXTUAL INFORMATION

1. Abstract: MVVA-Net contains two branches: the intra-frame aesthetic branch and the inter-frame aesthetic branch. The intra-frame aesthetic branch extracts the intra-frame aesthetic features of each single frame through the VGG-16 convolution structure and a multi-receptive field fusion module (MFF), then merges the intra-frame aesthetic features extracted from all frames. The inter-frame aesthetic branch uses the VGG-16 convolution structure and an LSTM to extract the inter-frame aesthetic features between every two frames, then merges the inter-frame aesthetic features extracted from all frames.
MVFF adaptively fuses the intra-frame and inter-frame aesthetic features, and the fused features are mapped to one dimension through a fully connected layer to represent the aesthetic quality of the video.
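
The fusion and scoring step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the GitHub repository for that). It assumes, purely for illustration, that per-frame features are merged by averaging, that "adaptive" fusion is a softmax-weighted sum of the two branch features, and that the final layer is a single linear unit. All function and parameter names below are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Merge per-frame feature vectors by averaging (assumed merge operation)."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def adaptive_fuse(intra: Sequence[float],
                  inter: Sequence[float],
                  gate_logits: Sequence[float]) -> List[float]:
    """Weighted sum of the two branch features.

    The two weights come from a softmax over gate_logits, which stand in
    for learned parameters (an assumption, not the published MVFF design).
    """
    w_intra, w_inter = softmax(gate_logits)
    return [w_intra * a + w_inter * b for a, b in zip(intra, inter)]


def fully_connected(features: Sequence[float],
                    weights: Sequence[float],
                    bias: float) -> float:
    """Map a feature vector to one scalar: the aesthetic quality score."""
    return sum(f * w for f, w in zip(features, weights)) + bias
```

In use, the intra-frame branch would supply one feature vector per frame and the inter-frame branch one per pair of frames. Each set is passed through `mean_pool`, the two results go to `adaptive_fuse`, and `fully_connected` produces the final score. In the real network the branch features come from VGG-16, MFF, and LSTM layers, and the gate and layer weights are learned during training.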