Vision-language Assisted Attribute Learning

Liang, Kongming; Wang, Xinran; Wang, Rui; Gao, Donghui; Jin, Ling; Liu, Weidong; Zhu, Xiatian; Ma, Zhanyu; Guo, Jun


arXiv:2312.07009 (cs)

[Submitted on 12 Dec 2023 (v1), last revised 15 Dec 2023 (this version, v2)]


Abstract: Large-scale attribute labeling is typically incomplete and partial, which poses significant challenges to model optimization. Existing attribute learning methods often either treat the missing labels as negatives or ignore them altogether during training, and either choice can substantially hamper model performance. To overcome these limitations, in this paper we leverage available vision-language knowledge to explicitly disclose the missing labels and thereby enhance model learning. Given an image, we predict the likelihood of each missing attribute label with the help of an off-the-shelf vision-language model, and randomly ignore those with high scores during training. Our strategy strikes a good balance between fully ignoring the missing labels and treating them all as negatives, as these high scores are found to be informative in revealing label ambiguity. Extensive experiments show that our proposed vision-language assisted loss achieves state-of-the-art performance on the newly cleaned VAW dataset. Qualitative evaluation demonstrates the ability of the proposed method to predict more complete attributes.
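
The strategy in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how "high scores" are determined or how the random selection is performed, so the `threshold` and `ignore_prob` parameters, the function names `build_targets` and `masked_bce`, and the use of a masked binary cross-entropy are all assumptions made for illustration. The `vlm_scores` input stands in for per-attribute likelihoods produced by some vision-language model.

```python
import math
import random


def build_targets(labels, vlm_scores, threshold=0.5, ignore_prob=0.5, rng=None):
    """Build per-attribute targets and a loss mask for one image.

    labels:      list with 1 (positive), 0 (negative), or None (missing).
    vlm_scores:  list of floats in [0, 1], one per attribute; hypothetical
                 likelihoods from a vision-language model.
    threshold, ignore_prob: illustrative assumptions, not from the paper.
    """
    rng = rng or random.Random()
    targets, mask = [], []
    for label, score in zip(labels, vlm_scores):
        if label is not None:
            # Annotated label: always contributes to the loss.
            targets.append(float(label))
            mask.append(1.0)
        elif score >= threshold and rng.random() < ignore_prob:
            # Missing label with a high score: randomly ignored.
            targets.append(0.0)
            mask.append(0.0)
        else:
            # Remaining missing labels: treated as negatives.
            targets.append(0.0)
            mask.append(1.0)
    return targets, mask


def masked_bce(probs, targets, mask, eps=1e-7):
    """Binary cross-entropy averaged over the unmasked attributes only."""
    total = 0.0
    for p, t, m in zip(probs, targets, mask):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= m * (t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    kept = sum(mask)
    return total / kept if kept > 0 else 0.0


if __name__ == "__main__":
    labels = [1, 0, None, None]
    scores = [0.9, 0.1, 0.8, 0.2]
    targets, mask = build_targets(labels, scores, rng=random.Random(0))
    print(targets, mask, masked_bce([0.7, 0.2, 0.6, 0.1], targets, mask))
```

In this sketch, setting `ignore_prob=0.0` treats every missing label as a negative, while `ignore_prob=1.0` with `threshold=0.0` ignores every missing label; intermediate settings sit between those two baselines.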

Comments:	Accepted by IEEE IC-NIDC 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2312.07009 [cs.CV]
	(or arXiv:2312.07009v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2312.07009

Submission history

From: Xinran Wang
[v1] Tue, 12 Dec 2023 06:45:19 UTC (1,883 KB)
[v2] Fri, 15 Dec 2023 02:40:29 UTC (1,883 KB)
