Context based image analysis with application in dietary assessment and evaluation

Multimedia Tools and Applications

Abstract

Dietary assessment is essential for understanding the link between diet and health. We develop a context based image analysis system for dietary assessment to automatically segment, identify and quantify food items from images. In this paper, we describe image segmentation and object classification methods used in our system to detect and identify food items. We then use context information to refine the classification results. We define contextual dietary information as the data that is not directly produced by the visual appearance of an object in the image, but yields information about a user’s diet or can be used for diet planning. We integrate contextual dietary information that a user supplies to the system either explicitly or implicitly to correct potential misclassifications. We evaluate our models using food image datasets collected during dietary assessment studies from natural eating events.
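The idea of using context to correct a misclassification can be illustrated with a small sketch. This is not the authors' method, which the abstract does not specify; it assumes, purely for illustration, that context is available as a per-user prior over food labels (for example, how often the user has reported each food) and that it is combined with classifier confidences by a simple product rule. The function name `refine_with_context`, the `weight` parameter, and all numbers are hypothetical.

```python
def refine_with_context(scores, prior, weight=1.0, eps=1e-6):
    """Re-rank classifier scores with a contextual prior (illustrative only).

    scores: dict mapping food label -> classifier confidence
    prior:  dict mapping food label -> assumed contextual probability
    weight: exponent controlling how strongly the prior is trusted
    eps:    small constant so labels missing from the prior are not zeroed out
    """
    combined = {
        label: score * (prior.get(label, 0.0) + eps) ** weight
        for label, score in scores.items()
    }
    total = sum(combined.values())
    # Normalise so the refined scores sum to 1.
    return {label: value / total for label, value in combined.items()}


# Hypothetical classifier output: visually similar beverages are confused.
scores = {"apple juice": 0.40, "beer": 0.45, "tea": 0.15}
# Hypothetical context: this user rarely reports beer.
prior = {"apple juice": 0.6, "tea": 0.3, "beer": 0.1}

refined = refine_with_context(scores, prior)
best_before = max(scores, key=scores.get)
best_after = max(refined, key=refined.get)
print(best_before, "->", best_after)
```

In this toy case the visual classifier alone prefers "beer", while the context-weighted scores prefer "apple juice". Setting `weight` to 0 ignores the prior and recovers the original ranking.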



Acknowledgements

This work was sponsored by the US National Institutes of Health under grants NIH/NCI 1U01CA130784-01 and NIH/NIDDK 2R56DK073711-04, 1R01-DK073711-01A1. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the US National Institutes of Health.

Author information

Corresponding author

Correspondence to Ye He.

About this article

Cite this article

Wang, Y., He, Y., Boushey, C.J. et al. Context based image analysis with application in dietary assessment and evaluation. Multimed Tools Appl 77, 19769–19794 (2018). https://doi.org/10.1007/s11042-017-5346-x

  • DOI: https://doi.org/10.1007/s11042-017-5346-x
