Subjective video quality assessment (VQA) strongly depends on semantics, context, and the types of visual distortions. Many existing VQA databases cover only small numbers of video sequences with artificial distortions. Newly developed Quality of Experience (QoE) models and metrics are commonly evaluated against subjective data from such databases, obtained in perception experiments. However, since the aim of these QoE models is to accurately predict the quality of natural videos, artificially distorted video databases are an insufficient basis for learning. Additionally, their small sizes make them only marginally usable for state-of-the-art learning systems, such as deep learning. To provide a better basis for the development and evaluation of objective VQA methods, we have created a larger dataset of natural, real-world video sequences with corresponding subjective mean opinion scores (MOS) gathered through crowdsourcing.

We took YFCC100m, consisting of 793,436 Creative Commons (CC) video sequences, as a baseline database and filtered it through multiple steps to ensure that the selected video sequences are representative of the whole spectrum of available video content, types of distortions, and subjective quality. The resulting 1,200 videos are available for download, alongside the subjective data and evaluations of the best-performing techniques available for multiple video attributes, namely blur, colorfulness, contrast, spatial information, temporal information, and video quality.
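Among the attributes above, spatial information (SI) and temporal information (TI) are typically computed following ITU-T Rec. P.910: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered luma frame, and TI is the maximum over time of the standard deviation of successive luma frame differences. The following is a minimal NumPy sketch of these two measures, not necessarily the implementation used for KoNViD-1k:

```python
import numpy as np

def conv3x3(img, k):
    """Valid 3x3 cross-correlation (kernel not flipped; irrelevant for
    gradient magnitudes). Output is (H-2, W-2)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def si_ti(frames):
    """SI/TI in the spirit of ITU-T P.910 for a list of 2-D luma frames
    (at least two frames required for TI)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    # SI: max over time of the spatial std of the Sobel gradient magnitude.
    si = max(float(np.std(np.hypot(conv3x3(f, kx), conv3x3(f, ky))))
             for f in frames)
    # TI: max over time of the spatial std of successive frame differences.
    ti = max(float(np.std(b - a)) for a, b in zip(frames, frames[1:]))
    return si, ti
```

In practice the luma plane would be extracted from each decoded video frame (e.g. the Y channel of YUV) before applying these measures.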

The KoNViD-1k data is publicly available to the research community. Please cite the following references if you use this database in your research:

  • V. Hosu, F. Hahn, M. Jenadeleh, H. Lin, H. Men, T. Szirányi, S. Li and D. Saupe, "The Konstanz Natural Video Database" http://database.mmsp-kn.de

  • V. Hosu, F. Hahn, M. Jenadeleh, H. Lin, H. Men, T. Szirányi, S. Li and D. Saupe, "The Konstanz Natural Video Database (KoNViD-1k)", Quality of Multimedia Experience (QoMEX), 2017 Ninth International Conference on. IEEE, 2017. LINK


Video Data: KoNViD-1k 8s video sequences LINK
Subjective Data: KoNViD-1k crowdsourcing data LINK
Aggregated MOS Values: LINK
Attribute Evaluation Data: KoNViD-1k video attributes LINK

The original 30s video sequences and per-frame video attribute evaluations can be shared upon request.
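The aggregated MOS values provided above are derived from the individual crowd ratings. A minimal sketch of such an aggregation for one video is shown below; the actual KoNViD-1k pipeline, including worker screening, may differ:

```python
import statistics

def mos(ratings):
    """Mean opinion score and 95% confidence interval for one item.

    `ratings` is a list of individual opinion scores (e.g. on a 1-5 scale)
    collected from different crowd workers for the same video.
    """
    n = len(ratings)
    m = statistics.mean(ratings)
    if n < 2:
        return m, 0.0
    # Normal-approximation confidence interval around the mean.
    ci = 1.96 * statistics.stdev(ratings) / n ** 0.5
    return m, ci
```

For example, `mos([3, 4, 5])` yields a MOS of 4 with a confidence interval that shrinks as more ratings are collected.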


The main challenge in applying state-of-the-art deep learning methods to predict image quality in the wild is the relatively small size of existing quality-scored datasets. The reason for the lack of larger datasets is the massive resources required to generate diverse and publishable content. To this end, we have created a large IQA database of natural, real-world images with corresponding mean opinion scores (MOS) gathered through crowdsourcing.

We sampled 10,073 images from one million YFCC100m images by enforcing a roughly uniform distribution across seven quality indicators and one content indicator. We then performed very large-scale crowdsourcing experiments on these images, obtaining 1.2 million ratings from 1,467 crowd workers.
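Enforcing a roughly uniform distribution over an indicator can be viewed as stratified sampling over histogram bins of that indicator's values. The sketch below illustrates the idea for a single indicator with illustrative bin counts; it is not the actual KonIQ-10k sampling procedure, which balanced several indicators jointly:

```python
import random

def stratified_sample(scores, n_bins=10, per_bin=5, seed=0):
    """Sample items so that one indicator is roughly uniformly distributed.

    `scores` maps an item id to its indicator value, assumed in [0, 1].
    `n_bins` and `per_bin` are illustrative parameters, not KonIQ-10k's.
    """
    rng = random.Random(seed)
    bins = [[] for _ in range(n_bins)]
    for item, s in scores.items():
        idx = min(int(s * n_bins), n_bins - 1)  # clamp s == 1.0 into last bin
        bins[idx].append(item)
    sample = []
    for b in bins:
        rng.shuffle(b)
        sample.extend(b[:per_bin])  # take at most per_bin items from each bin
    return sample
```

Bins that contain fewer than `per_bin` items simply contribute all their items, so the resulting distribution is only approximately uniform, as in the "roughly uniform" sampling described above.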


The KonIQ-10k data is publicly available to the research community. If you use our database in your research, we kindly ask that you cite our website listed below.

  • @misc{koniq10k,
    title = {{KonIQ-10K}: Towards an ecologically valid and large-scale {IQA} database},
    author = {Lin, Hanhe and Hosu, Vlad and Saupe, Dietmar},
    year = {2018},
    journal = {arXiv preprint arXiv:1803.08480},
    }


Download the whole database here: LINK

Please contact Vlad Hosu (vlad.hosu@uni-konstanz.de) or Hanhe Lin (hanhe.lin@uni-konstanz.de) if you have any questions.


This website is hosted by the Multimedia Signal Processing Group, University of Konstanz, Germany. More datasets will be published here in the future.