Explaining Image Aesthetics Assessment: An Interactive Approach

ABSTRACT
Assessing visual aesthetics is important for organizing and retrieving photos, which is one reason why several works aim to automate this assessment using deep neural networks. The underlying models, however, lack explainability. Because aesthetics is subjective, it is challenging to find objective ground truths and to base explanations on them. Such models are therefore prone to the socio-cultural biases that come with their training data, which raises a wide range of ethical and technical questions. This paper presents an explainable artificial intelligence (XAI) framework that adapts and combines three types of explanations for aesthetic assessment: 1) model constraints for built-in interpretability, 2) analysis of how perturbations affect decisions, and 3) generation of artificial images that maximize or minimize values in the latent feature space. The objective is to improve human understanding by building an intuition for the model's decision making. We identify issues that arise when humans interact with the explanations and derive requirements from human feedback to address the needs of different user groups. We evaluate our novel interactive XAI technology in a study with end users (N=20). Our participants have different levels of experience in deep learning, allowing us to include experts, intermediate users, and laypersons. Our results show the benefits of the interactivity of our approach: all users found our system helpful in understanding how the aesthetic assessment was executed, while reporting varying needs for explanatory detail.
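The second explanation type, analyzing how perturbations affect decisions, can be illustrated with a minimal occlusion-sensitivity sketch. This is not the paper's implementation: `aesthetic_score` below is a hypothetical stand-in (mean brightness) for a trained aesthetic model, and the patch size and gray occluder value are illustrative choices. The idea is to mask image regions one at a time and record how much the model's score drops, yielding a coarse map of the regions that drive the assessment.

```python
import numpy as np

def aesthetic_score(image):
    # Hypothetical placeholder for a trained aesthetic model's output;
    # here simply the mean brightness of the image.
    return float(image.mean())

def occlusion_map(image, score_fn, patch=8):
    """Slide a neutral-gray patch over the image and record the score
    change at each position (base score minus perturbed score)."""
    h, w = image.shape[:2]
    base = score_fn(image)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            perturbed = image.copy()
            perturbed[i:i + patch, j:j + patch] = 0.5  # gray occluder
            heat[i // patch, j // patch] = base - score_fn(perturbed)
    return heat

# Toy 32x32 grayscale image with one bright region in the top-left corner.
img = np.zeros((32, 32))
img[:8, :8] = 1.0
heat = occlusion_map(img, aesthetic_score)
# Occluding the bright region lowers the score more than occluding a
# dark region, so heat[0, 0] exceeds heat[2, 2].
```

A real system would replace `aesthetic_score` with a forward pass of the network and visualize `heat` as an overlay, so users can see which image regions increase or decrease the predicted aesthetic value.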