DOI: 10.1145/3589335.3651451
short-paper

Health CLIP: Depression Rate Prediction Using Health Related Features in Satellite and Street View Images

Published: 13 May 2024

ABSTRACT

Mental health is a state of mental well-being that enables people to cope with the stresses of life, realize their abilities, learn and work well, and contribute to their community. It has intrinsic and instrumental value and is integral to overall well-being, and its correlation with environmental factors is a subject of growing interest. As societal pressure increases, depression has become a severe problem in modern cities, and a way to estimate the depression rate would help in addressing it. In this study, we introduce a novel approach based on Contrastive Language-Image Pretraining (CLIP) to predict mental health indicators, in particular the depression rate, from satellite and street view images. Our methodology uses a state-of-the-art Multimodal Large Language Model (MLLM), GPT4-vision, to generate health-related captions for satellite and street view images. We then use the generated image-text pairs to fine-tune the CLIP model so that its image encoder extracts health-related features such as green spaces, sports fields, and infrastructural characteristics. The fine-tuning bridges the semantic gap between textual descriptions and visual representations, enabling a comprehensive analysis of geo-tagged images. With the combination of satellite and street view images, our methodology achieves an R² value of 0.565 for predicting the depression rate in New York City. The successful deployment of Health CLIP in a real-world scenario underscores the practical applicability of our approach.
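The abstract does not spell out the fine-tuning objective or how R² is computed, so the following is only a minimal sketch under stated assumptions: it uses the standard symmetric contrastive loss from the original CLIP work and the usual coefficient of determination. The function names, the toy embeddings, and the temperature of 0.07 are illustrative assumptions, not details taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss (CLIP-style).

    image_embs[i] and text_embs[i] are assumed to come from the same
    image-caption pair; all other combinations in the batch act as negatives.
    The temperature value is an assumption, not a setting from the paper.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = cosine similarity of image i and caption j, scaled.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # caption-to-image view
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy 2-D embeddings (hypothetical): matched pairs point the same way.
    images = [[1.0, 0.0], [0.0, 1.0]]
    captions_matched = [[1.0, 0.0], [0.0, 1.0]]
    captions_swapped = [[0.0, 1.0], [1.0, 0.0]]
    print(clip_contrastive_loss(images, captions_matched))
    print(clip_contrastive_loss(images, captions_swapped))
    print(r2_score([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

In this sketch the loss is small when each image embedding is closest to its own caption and large when the pairs are swapped, which is the signal that would pull an image encoder toward the features the captions describe. An R² of 1 means perfect prediction, and an R² of 0 means the predictor does no better than always predicting the mean.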

Supplemental Material

hdp2880.mp4: supplemental video (MP4, 42.5 MB)


Published in

WWW '24: Companion Proceedings of the ACM on Web Conference 2024
May 2024
1928 pages
ISBN: 9798400701726
DOI: 10.1145/3589335

        Copyright © 2024 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%