Learning Concise and Descriptive Attributes for Visual Recognition

Yan, An; Wang, Yu; Zhong, Yiwu; Dong, Chengyu; He, Zexue; Lu, Yujie; Wang, William; Shang, Jingbo; McAuley, Julian

arXiv:2308.03685 (cs)

[Submitted on 7 Aug 2023]

Abstract: Recent advances in foundation models present new opportunities for interpretable visual recognition -- one can first query Large Language Models (LLMs) to obtain a set of attributes that describe each class, then apply vision-language models to classify images via these attributes. Pioneering work shows that querying thousands of attributes can achieve performance competitive with image features. However, our further investigation on 8 datasets reveals that LLM-generated attributes in a large quantity perform almost the same as random words. This surprising finding suggests that significant noise may be present in these attributes. We hypothesize that there exist subsets of attributes that can maintain the classification performance with much smaller sizes, and propose a novel learning-to-search method to discover those concise sets of attributes. As a result, on the CUB dataset, our method achieves performance close to that of massive LLM-generated attributes (e.g., 10k attributes for CUB), yet using only 32 attributes in total to distinguish 200 bird species. Furthermore, our new paradigm demonstrates several additional benefits: higher interpretability and interactivity for humans, and the ability to summarize knowledge for a recognition task.
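
The attribute-based pipeline described in the abstract (score an image against a small set of text attributes, then classify from those scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the attribute strings, the 3-dimensional embeddings, the class weights, and the function names are all hypothetical stand-ins. In practice the embeddings would come from a vision-language model and the weights would be learned.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def attribute_scores(image_emb, attribute_embs):
    """One similarity score per attribute: the interpretable bottleneck."""
    return [cosine(image_emb, attr_emb) for attr_emb in attribute_embs]

def classify(image_emb, attribute_embs, class_weights):
    """Linear classifier on top of the attribute scores."""
    scores = attribute_scores(image_emb, attribute_embs)
    logits = {
        label: sum(w * s for w, s in zip(weights, scores))
        for label, weights in class_weights.items()
    }
    return max(logits, key=logits.get), logits

# Hypothetical toy data (assumptions, not taken from the paper).
ATTRIBUTES = ["red plumage", "long curved beak", "webbed feet"]
ATTRIBUTE_EMBS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
CLASS_WEIGHTS = {
    "cardinal": [1.0, 0.0, 0.0],
    "ibis": [0.0, 1.0, 0.5],
    "duck": [0.0, 0.0, 1.0],
}

image_emb = [0.9, 0.1, 0.2]  # stand-in for an image embedding
label, logits = classify(image_emb, ATTRIBUTE_EMBS, CLASS_WEIGHTS)
print(label)
for name, score in zip(ATTRIBUTES, attribute_scores(image_emb, ATTRIBUTE_EMBS)):
    print(f"{name}: {score:.3f}")
```

Because every prediction passes through a short list of named attribute scores, a person can read off which attributes drove the decision. That is the interpretability benefit the abstract attributes to concise attribute sets.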

Comments:	ICCV 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2308.03685 [cs.CV]
	(or arXiv:2308.03685v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.03685

Submission history

From: An Yan
[v1] Mon, 7 Aug 2023 16:00:22 UTC (14,477 KB)
