DOI: 10.1145/3594315.3594644
research-article

Food Image Recognition Method Based on Generative Self-supervised Learning

Published: 02 August 2023

ABSTRACT

Demand for automatic recognition of food images is growing. Food images are difficult to recognize because they vary widely in form, with small differences between classes and large differences within a class. This paper proposes a food image recognition method based on generative self-supervised learning. First, we use a BEiT-based model, pre-trained with a generative self-supervised learning method, as the feature extraction network to extract both the global semantics and the local detail features of food images. We then fine-tune a fully connected network (MLP) for classification through supervised learning. The model is tested on the mainstream public food image dataset Food-101 and achieves a top-1 accuracy of 85.99%. The experimental results show that the method significantly reduces the computation required for pixel-level representation while still extracting the global and detailed features of the image, and that it classifies and recognizes food images well. The method is robust, generalizes well, and is flexible, which gives it practical application value.
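The top-1 accuracy reported above is the fraction of test images whose highest-scoring predicted class matches the ground-truth label. The following is a minimal sketch of that metric only, not the authors' evaluation code. The function names `argmax` and `top1_accuracy` and the toy logits are hypothetical and chosen for illustration. The scores stand in for whatever the MLP classification head would output per class.

```python
from typing import Sequence


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring class equals the true label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = sum(1 for row, y in zip(logits, labels) if argmax(row) == y)
    return correct / len(labels)


# Toy data (assumption): 4 samples and 3 classes; Food-101 itself has 101 classes.
toy_logits = [
    [2.0, 0.5, 0.1],  # predicted class 0
    [0.2, 1.5, 0.3],  # predicted class 1
    [0.9, 0.1, 0.4],  # predicted class 0
    [0.1, 0.2, 3.0],  # predicted class 2
]
toy_labels = [0, 1, 2, 2]

print(top1_accuracy(toy_logits, toy_labels))  # 3 of 4 correct -> 0.75
```

Under this definition, a top-1 accuracy of 85.99% means that roughly 86 of every 100 test images receive the correct label as their single highest-scoring prediction.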


    • Published in

      ICCAI '23: Proceedings of the 2023 9th International Conference on Computing and Artificial Intelligence
      March 2023
      824 pages
      ISBN: 9781450399029
      DOI: 10.1145/3594315

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited