Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning

Authors

  • Jinsong Shi, Nanjing University of Aeronautics and Astronautics
  • Pan Gao, Nanjing University of Aeronautics and Astronautics
  • Jie Qin, Nanjing University of Aeronautics and Astronautics

DOI

https://doi.org/10.1609/aaai.v38i5.28285

Keywords

CV: Low Level & Physics-based Vision, CV: Applications

Abstract

Image Quality Assessment (IQA) has long been a research hotspot in image processing, especially No-Reference Image Quality Assessment (NR-IQA). Owing to their powerful feature extraction abilities, existing Convolutional Neural Network (CNN)- and Transformer-based NR-IQA methods have achieved considerable progress. However, they still exhibit limited capability when faced with unknown authentic distortion datasets. To further improve NR-IQA performance, this paper proposes SaTQA, a novel NR-IQA model based on supervised contrastive learning (SCL) and Transformers. We first train a model on a large-scale synthetic dataset via SCL (requiring no subjective quality scores) to extract degradation features of images with various distortion types and levels. To further extract distortion information from images, we propose a backbone network incorporating the Multi-Stream Block (MSB), which combines the inductive bias of CNNs with the long-range dependency modeling capability of Transformers. Finally, we propose the Patch Attention Block (PAB), which obtains the final quality score of a distorted image by fusing the degradation features learned through contrastive learning with the perceptual distortion information extracted by the backbone network. Experimental results on six standard IQA datasets show that SaTQA outperforms state-of-the-art methods on both synthetic and authentic datasets. Code is available at https://github.com/I2-Multimedia-Lab/SaTQA.
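
The SCL pretraining stage is the part of the pipeline that needs no subjective scores: images are labeled only by their synthetic distortion type and level, and embeddings sharing a label are pulled together. As a rough illustration (not the authors' implementation), the minimal PyTorch sketch below follows the general form of the supervised contrastive (SupCon) objective of Khosla et al.; the function name, temperature value, and label convention are assumptions made for this example.

import torch
import torch.nn.functional as F

def supcon_loss(embeddings, labels, temperature=0.07):
    # embeddings: (N, D) features from the encoder; labels: (N,) integer ids
    # encoding distortion type/level (hypothetical label scheme, no MOS needed).
    z = F.normalize(embeddings, dim=1)           # compare in cosine space
    sim = z @ z.t() / temperature                # (N, N) scaled similarities
    eye = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(eye, float('-inf'))    # exclude self-pairs from softmax
    # log-probability of picking candidate j for anchor i among all non-self j
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # positives: other images that share the same distortion label
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    pos_cnt = pos.sum(dim=1)
    valid = pos_cnt > 0                          # anchors with at least one positive
    pos_log_prob = torch.where(pos, log_prob, torch.zeros_like(log_prob)).sum(dim=1)
    return -(pos_log_prob[valid] / pos_cnt[valid]).mean()

In a pretraining loop of this kind, each batch would mix several distortion types and severities; minimizing the loss clusters same-degradation images in embedding space, yielding the degradation features that the PAB later fuses with the backbone's perceptual features.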

Published

2024-03-24

How to Cite

Shi, J., Gao, P., & Qin, J. (2024). Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(5), 4829-4837. https://doi.org/10.1609/aaai.v38i5.28285

Issue

Vol. 38 No. 5 (2024)

Section

AAAI Technical Track on Computer Vision IV