KonVid-150k: a dataset for no-reference video quality assessment of videos in-the-wild.
Götz-Hahn, Franz; Hosu, Vlad; Lin, Hanhe; Saupe, Dietmar
Authors
Franz Götz-Hahn
Vlad Hosu
Hanhe Lin
Dietmar Saupe
Abstract
Video quality assessment (VQA) methods focus on particular degradation types, usually artificially induced on a small set of reference videos. Hence, most traditional VQA methods under-perform in-the-wild. Deep learning approaches have had limited success due to the small size and limited diversity of existing VQA datasets, whether artificially or authentically distorted. We introduce a new in-the-wild VQA dataset that is substantially larger and more diverse: KonVid-150k. It consists of a coarsely annotated set of 153,841 videos with five quality ratings each, and 1,596 videos with a minimum of 89 ratings each. Additionally, we propose new efficient VQA approaches (MLSP-VQA) relying on multi-level spatially pooled deep features (MLSP). Compared to deep transfer learning approaches, they are exceptionally well suited for training at scale. Our best method, MLSP-VQA-FF, improves the Spearman rank-order correlation coefficient (SRCC) performance metric on the commonly used KoNViD-1k in-the-wild benchmark dataset to 0.82. It surpasses the best existing deep-learning model (0.80 SRCC) and hand-crafted feature-based method (0.78 SRCC). We further investigate how alternative approaches perform under different levels of label noise and dataset size, showing that MLSP-VQA-FF is the overall best method for videos in-the-wild. Finally, we show that the MLSP-VQA models trained on KonVid-150k set the new state-of-the-art for cross-test performance on KoNViD-1k and LIVE-Qualcomm, with SRCCs of 0.83 and 0.64, respectively. For KoNViD-1k, this inter-dataset testing outperforms intra-dataset experiments, showing excellent generalization.
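The abstract reports all results as SRCC values. As a reading aid, the following is a minimal sketch of how an SRCC between predicted quality scores and subjective scores can be computed, assuming the standard definition (Pearson correlation of the ranks, with tied values sharing their average rank). The function names `average_ranks` and `srcc` and the sample scores are hypothetical illustrations; they are not taken from the paper or from KonVid-150k.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def srcc(predicted, subjective):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(subjective)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("SRCC is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for illustration only (not data from the paper).
subjective_scores = [4.2, 3.1, 2.5, 3.8, 1.9]
predicted_scores = [4.0, 2.9, 3.3, 3.5, 2.0]
print(srcc(predicted_scores, subjective_scores))
```

Because SRCC depends only on rank order, a model that orders videos correctly scores well even when its absolute predictions are offset or rescaled relative to the subjective scores.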
Citation
GÖTZ-HAHN, F., HOSU, V., LIN, H. and SAUPE, D. 2021. KonVid-150k: a dataset for no-reference video quality assessment of videos in-the-wild. IEEE access [online], 9, pages 72139-72160. Available from: https://doi.org/10.1109/access.2021.3077642
Journal Article Type | Article |
---|---|
Acceptance Date | Apr 12, 2021 |
Online Publication Date | May 5, 2021 |
Publication Date | Dec 31, 2021 |
Deposit Date | May 3, 2022 |
Publicly Available Date | May 3, 2022 |
Journal | IEEE Access |
Electronic ISSN | 2169-3536 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Volume | 9 |
Pages | 72139-72160 |
DOI | https://doi.org/10.1109/access.2021.3077642 |
Keywords | Datasets; Deep transfer learning; Multi-level spatially-pooled features; Video quality assessment; Video quality dataset |
Public URL | https://rgu-repository.worktribe.com/output/1580723 |
Files
GÖTZ-HAHN 2021 KonVid-150k (VOR)
(2.9 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/