Quarterly Publication

Document Type: Original Article

Author

University of Agriculture, Pakistan.

10.22105/bdcv.2023.415895.1165

Abstract

Video Quality Assessment (VQA) is a critical component of many technologies, including automated video broadcasting through display technologies. Determining visual quality requires a balanced examination of visual features and functionality. Previous research has shown that features derived from pre-trained Convolutional Neural Networks (CNNs) are highly useful in a range of image analysis and computer vision tasks. In this work, we build an architecture for No-Reference Video Quality Assessment (NR-VQA) from features extracted by pre-trained deep neural networks, transfer learning, periodic pooling, and regression. The approach relies solely on dynamically pooled deep features and avoids handcrafted features. Specifically, this study describes a novel deep learning-based strategy for NR-VQA that uses several pre-trained deep neural networks in parallel to characterize probable image and video distortions. A set of pre-trained CNNs extracts spatially pooled and intensity-adjusted video-level feature representations, which are then individually mapped onto subjective assessments. Finally, the perceived quality of a video sequence is calculated by combining the quality predictions of the individual regressors. Extensive experiments demonstrate that the proposed approach sets a new state of the art on two large benchmark video quality assessment datasets with realistic distortions. Furthermore, the findings show that combining the decisions of different deep networks can greatly improve NR-VQA.
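The pipeline outlined in the abstract (per-network feature extraction, pooling to a video-level vector, one regressor per network, and fusion of the regressors' outputs) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: random arrays stand in for CNN features, mean pooling over frames stands in for the pooling step, closed-form ridge regression stands in for the regressor, and plain averaging stands in for the decision fusion. The names `pool_video`, `fit_ridge`, `predict_ridge`, and `fuse` are hypothetical.

```python
import numpy as np


def pool_video(frame_features):
    """Collapse per-frame features (frames x dims) into one video-level vector."""
    # Assumption: mean pooling over the frame axis; the paper's pooling may differ.
    return frame_features.mean(axis=0)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the bias is the last entry of the result."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = alpha * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # do not penalize the bias term
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)


def predict_ridge(w, X):
    """Apply fitted ridge weights (bias last) to a feature matrix."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def fuse(predictions):
    """Decision fusion by averaging the per-network quality predictions."""
    return np.mean(np.stack(predictions, axis=0), axis=0)


# Synthetic demonstration: random features replace real CNN activations.
rng = np.random.default_rng(0)
n_videos, n_frames = 40, 10
scores = rng.uniform(1.0, 5.0, size=n_videos)  # stand-in for subjective scores

per_network_predictions = []
for dim in (8, 12):  # two hypothetical backbones with different feature sizes
    frames = rng.normal(size=(n_videos, n_frames, dim))
    video_features = np.stack([pool_video(f) for f in frames])
    weights = fit_ridge(video_features, scores)  # one regressor per network
    per_network_predictions.append(predict_ridge(weights, video_features))

fused = fuse(per_network_predictions)  # one fused quality estimate per video
print(fused.shape)
```

Averaging is only the simplest possible fusion rule. A real evaluation would also fit the regressors on a training split and report correlation with subjective scores on held-out videos, which this sketch omits.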

Keywords
