Searching for cancer-related information online: Unintended retrieval of complementary and alternative medicine information

https://doi.org/10.1016/j.ijmedinf.2005.01.001

Summary

Purpose:

The Web is an important source of health information for consumers. Use of complementary and alternative medicine (CAM) is also increasing. Therefore, we studied the likelihood that consumers will incidentally encounter CAM information while searching the Web and the factors that influence retrieval of CAM information.

Methods:

We evaluated results retrieved by 10 cancer-related searches on six common search engines.

Results:

Of 1121 search results, 16.2% displayed CAM information. Sponsored (i.e., paid) results were more likely to display CAM information than non-sponsored results (38% versus 7.5%, p < 0.001). In Overture and Google, sponsored results accounted for 51% and 39% of results on the first page. These search engines also retrieved more CAM web pages. Search engines distinguished sponsored and non-sponsored results, but disclosure statements describing the differences were confusing. Cancer type used as the search keyword did not influence the number of CAM web pages retrieved. However, synonyms of cancer differed in their retrieval of CAM web pages (p < 0.001). Consistent with prior studies of Web search engine overlap, we found that 28% of CAM results were retrieved by two or more search engines.

Conclusions:

Clinicians should help consumers recognize sponsored results and encourage search engines to clearly explain sponsored results.

Introduction

Sixty-two percent of Web users in the USA have searched for health information online [1]. Consumers usually use general-purpose search engines without a defined search plan [2]. Although users express concern regarding the quality and accuracy of online health information, few users recall where they obtained the information to answer their health-related question [2]. In spite of growing concern about quality and accuracy of online health information, there is mounting evidence that online information affects consumer behavior [3], [4], [5]. Therefore, we studied the likelihood that consumers will incidentally encounter information regarding complementary and alternative medicine (CAM) while searching for cancer information on the Web, as well as the factors that influence the retrieval of CAM information.

In contrast to conventional medicine, CAM is controlled primarily by consumers. CAM therapies are generally not regulated by government agencies and do not require a prescription. Therefore, consumers have direct access to CAM therapies, such as dietary supplements, without the help of a licensed clinician. In addition, many patients who use CAM do not share this information with their physician [6].

CAM use is high, particularly among cancer patients. Richardson et al. found that over 80% of cancer patients surveyed at a large cancer institute had used CAM therapies [7]. Further, the type of cancer may affect CAM usage. Over 72% of breast cancer patients interviewed 2–3 years after initial surgical treatment used some form of CAM [8]. Morris et al. report that “breast cancer patients are far more likely to be consistent users compared with other tumor sites” [9].

Some patients specifically search for CAM information online. Indeed, 48% of searchers report looking for information about CAM or experimental treatments. Those treated for a serious illness in the past year were more likely to have searched for such information [1]. Others encounter CAM information when searching for information about disease states. How such unintentional retrieval of CAM information affects patient choices is not known.

As an unregulated domain, CAM information on the Web may be particularly susceptible to poor quality. In prior work, we found consumers searching for ‘arthritis’, ‘fibromyalgia’ and ‘diabetes’ were likely to encounter commercially oriented CAM websites whose information was not supported by the conventional literature. About 70% of these CAM websites were selling products or services [10]. In another study, Ernst et al. reviewed 13 cancer-related CAM websites retrieved via CAM-specific searches, and found 38.5% of websites contained information that may harm a patient if the advice were followed [11].

Women use CAM more than men [12] and look online for information in greater numbers than men [3]. Therefore, searches on cancers that affect women may return more CAM web pages than searches on cancers that affect men. Specifically, we expected ‘breast cancer’ searches to retrieve more CAM web pages than searches for ‘prostate cancer’. Synonyms of cancer may also affect the retrieval of CAM web pages, as consumers likely search using lay terms rather than scientific ones.

Many search engines accept payments for websites to be prominently listed in their results. In fact, this industry is projected to be worth US$ 4 billion by 2005 [13]. Such ‘sponsored results’ are different from traditional advertising, such as banner ads, because they are generated in response to specific searches. Sponsored results are prominently displayed at the top of the results page, sometimes along with regular search results. Perhaps most concerning of all, consumers are unaware of the existence of sponsored results and do not differentiate them from non-sponsored results, which are generally retrieved using an algorithm that does not factor in payment for placement. In fact, 41% of links clicked by consumers are sponsored [14]. The Federal Trade Commission has recommended that search engines clearly identify sponsored results [15]. Since most CAM information is commercially driven, we hypothesized that sponsored results contain more CAM information than non-sponsored results.

Little is known about the overlap of health-related search results between search engines. High overlap suggests consumers are likely to see the same group of websites regardless of which search engine they choose. Low overlap suggests that consumers may find a diversity of information through the use of more than one search engine. In 1998, Lawrence and Giles discovered that no search engine indexed more than one-third of the Web, and that using six different search engines retrieved 3.5 times more web pages than one search engine alone [16]. However, since 1998, Google has introduced large databases created using crawlers with a ranking algorithm based on link-popularity. The success of Google encouraged other search engines to use similar strategies. Further, overlap between search engines in a restricted domain (cancer CAM) may be greater than the 1998 estimates, which were domain independent. Therefore, we sought to determine the overlap for CAM search results between search engines, and to identify those sites that appear most frequently on the first page of search results.
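One way to make the notion of overlap concrete is to count how many distinct web pages are returned by two or more search engines. The sketch below is illustrative only: the function name `overlap_fraction`, the engine labels, and the placeholder URLs are assumptions, and this excerpt does not state the exact overlap definition used in the study.

```python
from collections import Counter


def overlap_fraction(results_by_engine):
    """Return the fraction of distinct URLs retrieved by two or more engines.

    results_by_engine maps an engine name to the list of URLs it returned.
    This is one plausible overlap measure, not necessarily the study's.
    """
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        # Count each URL once per engine, even if that engine lists it twice.
        engines_per_url.update(set(urls))
    if not engines_per_url:
        return 0.0
    shared = sum(1 for n in engines_per_url.values() if n >= 2)
    return shared / len(engines_per_url)


# Hypothetical first-page results (placeholder URLs, not study data).
example = {
    "engine_a": ["site1.example", "site2.example", "site3.example"],
    "engine_b": ["site2.example", "site4.example"],
    "engine_c": ["site2.example", "site3.example", "site5.example"],
}
print(overlap_fraction(example))
```

In this toy input, two of the five distinct URLs appear in more than one engine's results. A value near 1 would correspond to the high-overlap case described above, and a value near 0 to the low-overlap case.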

To summarize, we hypothesized that:

  • Consumers are likely to incidentally encounter CAM web pages while searching for cancer information on the Web.

  • Sponsored results are more likely to contain CAM information compared to non-sponsored search results.

  • The search keyword will influence the number of CAM web pages retrieved.

  • There is limited overlap between result sets retrieved by different search engines.

Section snippets

Web search

We chose six search engines (Google, Yahoo, MSN, AOL, Ask Jeeves and Overture) to carry out the searches because they are the most popular search engines in terms of audience reach and time spent searching [17]. We recognized that particular search results may be duplicated due to licensing agreements between search engines. For example, Overture provides sponsored results to Yahoo. Similarly, Google provides both sponsored and non-sponsored results for AOL. However, in this study we were

Prevalence of CAM web pages

Of the 1121 web pages analyzed for our 10 cancer-related keyword searches, 182 (16.2%) were classified as CAM, while 939 (83.8%) were classified as non-CAM (Table 1). Three quarters of all CAM web pages retrieved had a commercial top-level domain (.com); organizational (.org) and government (.gov) domains accounted for 10% and 7% of the CAM web pages, respectively (see Table 1). Of the 143 educational (.edu) web pages retrieved, none contained CAM information.
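The prevalence figures can be checked with simple arithmetic. The short sketch below uses only the counts reported in the paragraph above; the variable names and the `percent` helper are illustrative assumptions.

```python
# Counts reported in the text above.
total_pages = 1121
cam_pages = 182
non_cam_pages = 939

# The two classes should account for every analyzed page.
assert cam_pages + non_cam_pages == total_pages


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


cam_pct = percent(cam_pages, total_pages)
non_cam_pct = percent(non_cam_pages, total_pages)
print(cam_pct, non_cam_pct)  # 16.2 83.8
```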

Sponsored versus non-sponsored results

Of the 1121 web pages analyzed, 318

Discussion

We found that over 16% of web pages retrieved by representative searches for cancer information contain CAM information. Nearly 40% of sponsored results display CAM information. Further, 75% of all CAM web pages retrieved were hosted on a commercial website (.com top level domain). This is consistent with our prior finding that the majority of CAM web pages were commercial in nature [10]. Search engines also differ in the number of sponsored results they present. Google and Overture provide the

Acknowledgments

Supported by a training fellowship from the Keck Center for Computational and Structural Biology of the Gulf Coast Consortia (NLM Grant No. 5T15LM07093, M.W.), by a grant from the Robert Wood Johnson Foundation under its national program, Health e-Technologies (to E.V.B. and F.M.-B.) and NLM grant 5K22LM008306 (to E.V.B.).

References (29)

  • J.M. Metz et al.

    A multi-institutional study of Internet utilization by radiation oncology patients

    Int. J. Radiat. Oncol. Biol. Phys.

    (2003)
  • K.T. Morris et al.

    A comparison of complementary therapy use between breast cancer patients and patients with other primary tumor sites

    Am. J. Surg.

    (2000)
  • S. Fox et al.

    Vital Decisions: How Internet Users Decide What Information to Trust When They or Their Loved Ones are Sick

    (2002)
  • G. Eysenbach et al.

    How do consumers search for and appraise health information on the world wide web? Qualitative study using focus groups, usability tests, and in-depth interviews

    BMJ

    (2002)
  • S. Fox et al.

    The Health Care Revolution: How the WEB helps Americans take Better Care of Themselves

    (2000)
  • P.R. Helft et al.

    Hope and the media in advanced cancer patients

  • D.M. Eisenberg et al.

    Perceptions about complementary therapies relative to conventional therapies among adults who use both: results from a national survey

    Ann. Internal Med.

    (2001)
  • M.A. Richardson et al.

    Complementary/alternative medicine use in a comprehensive cancer center and the implications for oncology

    J. Clin. Oncol.

    (2000)
  • T. Ashikaga et al.

    Use of complimentary and alternative medicine by breast cancer patients: prevalence, patterns and communication with physicians

    Support Care Cancer

    (2002)
  • S. Sagaram et al.

    Evaluating the prevalence, content and readability of complementary and alternative medicine (CAM) web pages on the Internet

    Proc. AMIA Symp.

    (2002)
  • E. Ernst et al.

    ‘Alternative’ cancer cures via the Internet?

    Br. J. Cancer

    (2002)
  • D.M. Shumay et al.

    Determinants of the degree of complementary and alternative medicine use among patients with cancer

    J. Altern. Complement Med.

    (2002)
  • M. Hines, Yahoo to buy Overture for $ 1.63 billion, CNET news.com, 2003. http://news.com.com/2100-1030_3-1025394.html,...
  • L. Marable, False oracles: consumer reaction to learning the truth about how search engines work. Results of an...