Food Image Recognition Method Based on Generative Self-supervised Learning

Authors:

Shan Zhu
School of Artificial Intelligence, Shanghai Normal University Tianhua College, China
ORCID: 0000-0003-1419-2245

Xufeng Ling
School of Artificial Intelligence, Shanghai Normal University Tianhua College, China
ORCID: 0000-0001-5217-3614

Kui Zhang
School of Artificial Intelligence, Shanghai Normal University Tianhua College, China
ORCID: 0009-0005-8194-674X

Jiachao Niu
School of Artificial Intelligence, Shanghai Normal University Tianhua College, China
ORCID: 0009-0002-7864-7018

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence, March 2023, Pages 203–207
https://doi.org/10.1145/3594315.3594644

Published: 02 August 2023

ABSTRACT

Demand for automatic recognition of food images in everyday life is increasing. Food images take diverse forms, with small differences between classes and large differences within classes, which makes them difficult to recognize. This paper proposes a food image recognition method based on generative self-supervised learning. First, we use a BEiT-based model, pre-trained through generative self-supervised learning, as the feature extraction network to extract the global semantics and local detail features of food images. We then fine-tune a fully connected network (MLP) for classification and recognition through supervised learning. The model is tested on Food-101, a mainstream public food image dataset, and obtains a top-1 accuracy of 85.99%. The experimental results show that the method can significantly reduce the computation of pixel-level representation while extracting both the global and detailed features of the image, achieving good food image classification and recognition performance. The method offers good robustness, generalization, and flexibility, and has practical application value.
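The classification stage and the reported metric can be illustrated with a minimal sketch. It assumes that feature vectors have already been produced by a pre-trained encoder, and it is not the authors' implementation: the function names `mlp_head` and `top1_accuracy`, the two-dimensional toy features, and the identity weights are all hypothetical choices made only to keep the arithmetic easy to follow. A real head would be learned by supervised training on much higher-dimensional features.

```python
def mlp_head(features, w1, b1, w2, b2):
    """Two-layer MLP head: features -> ReLU hidden layer -> class logits.

    w1 and w2 hold one weight row per output unit.
    """
    hidden = [
        max(0.0, sum(x * w for x, w in zip(features, row)) + b)
        for row, b in zip(w1, b1)
    ]
    return [
        sum(h * w for h, w in zip(hidden, row)) + b
        for row, b in zip(w2, b2)
    ]


def top1_accuracy(logits_batch, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    hits = 0
    for logits, label in zip(logits_batch, labels):
        predicted = max(range(len(logits)), key=lambda i: logits[i])
        hits += predicted == label
    return hits / len(labels)


# Toy stand-ins for encoder features (hypothetical values, 2-D for brevity).
features_batch = [[2.0, 1.0], [0.5, 3.0], [1.0, 0.2]]
labels = [0, 1, 1]

# Identity weights keep the arithmetic easy to follow; a real head is learned.
w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
w2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]

logits_batch = [mlp_head(f, w1, b1, w2, b2) for f in features_batch]
accuracy = top1_accuracy(logits_batch, labels)
print(f"top-1 accuracy: {accuracy:.2%}")
```

In this toy run two of the three samples are classified correctly. Top-1 accuracy counts a prediction as correct only when the single highest-scoring class equals the ground-truth label, which is the sense in which the 85.99% figure on Food-101 is reported.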

Index Terms

Computing methodologies
  Artificial intelligence
    Computer vision
      Computer vision problems
        Object recognition

Published in

ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence
March 2023
824 pages
ISBN:9781450399029
DOI:10.1145/3594315

Copyright © 2023 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Research
- Refereed limited