Does the packaging of health information affect the assessment of its reliability?

Wikipedia is frequently used as a source of health information. However, the quality of its content varies widely across articles. The DISCERN tool is a brief questionnaire developed in 1996 by the Division of Public Health and Primary Health Care of the Institute of Health Sciences of the University of Oxford. Its developers state that it provides users with a valid and reliable way of assessing the quality of written information. However, the DISCERN instrument’s reliability in measuring the quality of online health information, particularly whether its scores are affected by reader biases about specific publication sources, has not yet been explored.


Introduction
The internet is a crucial source of health information for health practitioners, students and the public. In the online information landscape, Wikipedia, a widely accessible and free encyclopedia, stands out as one of the most frequently consulted sources of online health information. Despite its frequent use, the health content in Wikipedia varies widely in quality. Thus, it is important for all consumers of Wikipedia's health content to consider the quality of a specific article before applying its content. This is specifically the aim of the DISCERN instrument, which was first developed in 1996 by the Division of Public Health and Primary Health Care of the Institute of Health Sciences of the University of Oxford. DISCERN is a brief questionnaire that provides users with a valid and reliable way of assessing the quality of written information. [1] The original tool comprises 15 questions that address targeted aspects of the article, followed by a 16th question that asks for the respondent's overall impression of the publication's quality. The DISCERN instrument has been widely used to measure the quality of online health information, as demonstrated by its application in 244 published studies, including quality assessments of Wikipedia articles.
Nevertheless, the accuracy of DISCERN when it is used in a blinded study has not yet been evaluated. Therefore, it remains unknown whether the use of DISCERN to evaluate health information is affected by reader bias about specific publication sources. The authors hypothesize that an individual's responses to the questions in the DISCERN instrument might be influenced by their perception of the document's publisher; specifically, that the same information will be rated differently depending on where the reader perceives it to have come from. This study aims to answer the question: Does the packaging of health information affect the assessment of its reliability using the DISCERN instrument? To answer it, the authors will conduct a double-blind randomized assessment of the same information packaged as a Wikipedia article versus a BMJ literature review. Participants will use the DISCERN instrument to assess the information's reliability; however, the instrument has been modified (Appendix A) to provide clear language for participants and to remove questions that relate specifically to treatments or interventions.
Participants, including physicians and medical residents, will be asked to evaluate and compare the quality of either both articles in their original formats or both articles in inverted formats. Differences in article scoring between the two groups will allow the authors to determine DISCERN's ability to overcome bias related to the article's publication source.

Literature review
The quality of Wikipedia's health content has received the vast majority of the academic attention paid to Wikipedia in the context of its use as a health information resource. The reports of Wikipedia's quality in the academic literature generally focus on Wikipedia's suitability for: patients or the general health consumer; students in health sciences; or professionals in the field of health and wellness.
To date, topics included in the assessment of Wikipedia's content for patient education or consumer health include: gastroenterology, [2] nephrology, [3] cancer, [4][5][6][7] autoimmune disorders, [8] medicinal drugs [9][10][11][12][13] or herbal supplements, [14] pathology informatics, [15] surgery, [16][17] toxicology, [18][19] nutrition, [20][21][22] complementary and alternative medicine, [23] hearing loss, [24] and mental health or the brain. [25][26] These assessments examine readability, reliability, and accuracy or completeness, and specifically discuss their findings in relation to the public consumer or patient. Of those that include results (some conference proceedings do not), there is some agreement that Wikipedia is suitable for patients, and a 2010 study found that, while Wikipedia is not necessarily the superior resource, it is the preferred resource. [5]
There is strong evidence in the literature that students enrolled in health and medicine programs are highly likely to use or have used Wikipedia to supplement their education. Herbert et al. (2015) present evidence that suggests most medical students use Wikipedia at a moderate or high rate (67%), but this investigation reports a response rate of 21%, so the findings cannot be generalized. [27] Judd and Kennedy (2011) found that medical students used Google in 69% of biomedical sessions in a computer laboratory and Wikipedia in 51% of those same sessions. [28] While the study notes an interesting trajectory whereby students' reliance on Wikipedia decreases each year from first year to third year, actual Wikipedia use remains prominent throughout students' progression through the curriculum. At Queen's University [29] and UCSF, [30] Wikipedia is used in formal education as a learning tool for evidence-based medicine. Overall, however, there is a lack of consensus in the literature about Wikipedia's suitability for health education.
Some studies conclude Wikipedia is suitable for students [31][32] while many conclude it is not. [33][34][35][36][37]
A minority of evaluations of Wikipedia's health content consider its suitability for health care workers, and the outcomes of these studies are also inconsistent. Park, Masupe, Joseph, et al. (2016) report that Botswanan health care workers' perceptions of Wikipedia's quality are divided at best. Further, participants in the Botswana study indicated that Wikipedia's medical content was valuable simply because it is freely available and, through a now defunct relationship with telecommunications companies, remained accessible when internet access was lost. [38] However, the ability to access Wikipedia offline has been unavailable since 2018. [39] As a surgical reference, Wikipedia is found to be accurate, albeit incomplete, and an appropriate resource. [40] Conversely, the drug information on Wikipedia is deemed variable in comparison with Micromedex and is therefore considered an inappropriate drug reference for professionals. [14]

Methods and design

Design
This is a factorial double-blind randomized controlled trial to determine whether the way an article is packaged affects the score it receives when the DISCERN tool is used to evaluate its reliability and quality.
The study will involve four intervention arms:
• Arm 1: will use DISCERN to evaluate an original BMJ article first and an original Wikipedia article second (control group A)
• Arm 2: will use DISCERN to evaluate an original Wikipedia article first and an original BMJ article second (control group B)
• Arm 3: will use DISCERN to evaluate a BMJ article formatted as a Wikipedia article first and a Wikipedia article formatted as a BMJ article second (experiment group A)
• Arm 4: will use DISCERN to evaluate a Wikipedia article formatted as a BMJ article first and a BMJ article formatted as a Wikipedia article second (experiment group B)
Controlling the order in which the articles are read, as prescribed in Arms 1 and 2 and again in Arms 3 and 4, will allow the researchers to determine whether a sequence effect may have influenced the scoring of the article.
The study involves four Canadian medical schools, including three in Ontario and one in British Columbia, allowing for recruitment of medical faculty and students possessing the relevant backgrounds of knowledge and experience to complete the study intervention. Consenting participants will be asked to attend one session, organized in their home institution, supervised by one of the co-investigators who will ensure that participants do not have access to any outside materials while completing the study intervention.

Settings
This study will be conducted on four university campuses in Ontario and British Columbia that include a medical school and that are also within reasonable proximity of the researchers' home campuses to facilitate in-person administration of participants' packets. Such institutions include:

Participants and recruitment
Participants will include faculty from the four medical institutions listed above. Participants will be recruited through a combination of a purposive approach, in which individuals who meet the inclusion criteria are contacted directly by e-mail or telephone, and study advertisement using paper and electronic posters.
Individuals who wish to take part in the study will be required to read, complete, and sign a consent form prior to attending the supervised session. Consent forms will be stored by the co-investigators. Participants will be able to withdraw their consent at any time prior to the commencement of data analysis.

Sample size
The four medical schools included in this study report an approximate cumulative 4,770 full- and part-time faculty members (Table 1).
To achieve the desired confidence level of 90% and a margin of error of 5%, the authors will randomly select 336 participants from the pool of faculty members recruited for the study. If fewer than 336 participants are recruited, the authors will use a convenience sampling method until at least 336 participants have been recruited.
The estimated sample size to produce statistically significant results was calculated using the following formula:

Randomization
Each recruited participant's name and contact information and their corresponding participant ID will be kept in a separate, encrypted and password-protected MS Excel spreadsheet. Once the recruitment phase is complete, the authors will use the RANDBETWEEN function in MS Excel to randomly select 336 participants. If fewer than 336 participants are recruited, the authors will employ a convenience sampling method until 336 participants have been recruited.
A total of 84 participant packets will be created for each arm of the study, and each packet will be labeled with a unique number ranging from 001 to 336. An independent volunteer who is not participating in administering the study will enter numbers 001 to 336 in MS Excel. Using the RANDBETWEEN function in MS Excel, 84 numbers will be randomly selected a total of four times. Each group of 84 numbers will be assigned to Arm 1, Arm 2, Arm 3 or Arm 4, respectively. The volunteer will then pack each numbered packet with the relevant documents for the arm to which it has been assigned.
Using the same encrypted and password-protected MS Excel spreadsheet as above, the researchers will track the envelope numbers that are distributed during administration of the study and to whom each envelope number is assigned. This record will be used exclusively for the purpose of removing a participant's data from the study in the event they decide to withdraw their consent. Neither participants nor the researchers will know the arm to which participants have been assigned.
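The packet allocation described above can be sketched in a few lines of Python. This is an illustrative alternative, not the authors' procedure: the protocol specifies RANDBETWEEN in MS Excel, whereas this sketch shuffles the packet numbers, a swapped-in technique chosen because independent RANDBETWEEN draws can return the same number more than once. The function name `allocate_packets`, the constants, and the seed are hypothetical.

```python
import random

ARMS = ("Arm 1", "Arm 2", "Arm 3", "Arm 4")
PACKETS_PER_ARM = 84  # 4 arms x 84 packets = 336 packets, as in the protocol


def allocate_packets(seed=None):
    """Map each packet number (1..336) to one of the four arms, 84 per arm."""
    rng = random.Random(seed)
    packet_numbers = list(range(1, len(ARMS) * PACKETS_PER_ARM + 1))
    rng.shuffle(packet_numbers)  # random order with no repeated numbers
    allocation = {}
    for index, arm in enumerate(ARMS):
        start = index * PACKETS_PER_ARM
        # Each consecutive slice of 84 shuffled numbers goes to one arm.
        for number in packet_numbers[start:start + PACKETS_PER_ARM]:
            allocation[number] = arm
    return allocation


if __name__ == "__main__":
    # The seed is an arbitrary illustrative value; recording it makes the
    # allocation reproducible for audit purposes.
    allocation = allocate_packets(seed=2024)
    for number in sorted(allocation)[:5]:
        print(f"{number:03d} -> {allocation[number]}")
```

Because every packet number appears exactly once in the shuffled list, each arm receives exactly 84 distinct packets without any manual check for duplicates.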

Interventions
Eligible and consenting participants will be randomized into one of four arms as outlined in the study design.
Participants will be required to attend a 30- to 60-minute session supervised by study investigators (JH, DS or LR) during which they will receive their participant package. Each package will include the pre-participation survey (Appendix B), the DISCERN instrument, two articles placed in the order they should be read according to the arm to which the envelope number has been assigned, and the post-participation questionnaire. All materials, including articles and questionnaires, will be collected by investigators at the end of the time allocated for completion. Following collection of the articles and DISCERN questionnaires, participants will be asked to complete a short additional questionnaire inquiring about their prior knowledge of the articles.

Proposed outcome measures

Primary
The modified DISCERN instrument is composed of 10 questions covering the depth of content, scientific accuracy, completeness, justification or evidence given, and readability grade. Users respond to each question using a scale from 1 to 5, where 1 represents serious or extensive shortcomings, 3 signifies potentially important but not serious shortcomings, and 5 constitutes minimal shortcomings. The final grade of the article is determined through a composite score of the 10 questions. The primary outcome of our study is the difference in scores between the two original articles compared with the difference in scores between the two modified articles. If the difference in scores between the two original articles is not significantly different from the difference in scores between the two modified articles, it may be concluded that DISCERN is not effective in overcoming article sourcing bias.
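The scoring arithmetic behind the primary outcome can be illustrated with a minimal Python sketch. Two things here are assumptions rather than details given in the protocol: the composite is taken as the plain sum of the 10 item ratings, and each participant's difference is taken as the composite of the article presented in BMJ format minus the composite of the article presented in Wikipedia format. All function names and the ratings in the usage example are hypothetical.

```python
from statistics import mean

N_ITEMS = 10                    # questions in the modified DISCERN instrument
MIN_SCORE, MAX_SCORE = 1, 5     # rating scale for each question


def composite_score(item_scores):
    """Sum the ten 1-5 item ratings (assumed composite; range 10-50)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(not MIN_SCORE <= s <= MAX_SCORE for s in item_scores):
        raise ValueError("item scores must lie between 1 and 5")
    return sum(item_scores)


def score_difference(bmj_format_items, wp_format_items):
    """Composite of the BMJ-formatted article minus the Wikipedia-formatted one."""
    return composite_score(bmj_format_items) - composite_score(wp_format_items)


def mean_difference(participants):
    """Average the within-participant differences for one group.

    `participants` is a list of (bmj_format_items, wp_format_items) pairs.
    """
    return mean(score_difference(bmj, wp) for bmj, wp in participants)


if __name__ == "__main__":
    # Invented ratings, for illustration only; not study data.
    original = [([4] * 10, [3] * 10), ([5] * 10, [3] * 10)]
    modified = [([4] * 10, [4] * 10), ([3] * 10, [4] * 10)]
    print(mean_difference(original) - mean_difference(modified))
```

Keeping the within-participant difference as the unit of analysis means each participant contributes one number, which is then compared between the original-format and modified-format groups.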

Secondary
The academic backgrounds and expertise of our participants may result in previous knowledge of or familiarity with the articles used in this study. The chance of this occurring cannot be eliminated and must be considered in the data analysis. Therefore, our secondary outcome will address the potential un-blinding of participants by using a short questionnaire (Appendix C) to determine whether subjects recognized one or both of the articles from previous readings. We will also determine whether the order in which the modified articles were read had an impact on grading by participants.

Primary assessment
A modified version of the DISCERN instrument will be used to collect data from participants (Appendix A). All responses to each DISCERN questionnaire will be entered into SPSS and separated into four groups: BMJ as BMJ, BMJ as WP, WP as WP, WP as BMJ.
The following inferential statistical tests may be conducted with the collected data:
1. Paired t-test to determine whether the mean difference in individual DISCERN scores in Arms 1 and 2 is statistically different from that in Arms 3 and 4. This test will not consider the effect that sequence may have on the DISCERN scores for each article.
2. Multi-level ordinal regression using all four arms to determine whether the order in which the two articles are read by participants potentially influenced their assessment of each article.
3. One-way ANOVA to determine whether the difference between DISCERN scores within each of Arms 1 and 2 differs significantly from that within Arms 3 and 4.
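As a rough illustration of how per-participant score differences from two groups of arms might be compared, the sketch below computes Welch's t statistic using only the Python standard library. This is a swapped-in technique, not one of the tests listed above: Welch's test treats the two groups as independent samples, which is an assumption made here because the original-format and modified-format arms contain different participants. The function name `welch_t` and the example values are hypothetical, and a p-value would still need to come from a statistics package such as SPSS.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t_stat = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


if __name__ == "__main__":
    # Invented difference scores, for illustration only; not study data.
    original_arms = [10, 12, 8, 11, 9]
    modified_arms = [2, -1, 3, 0, 1]
    t_stat, df = welch_t(original_arms, modified_arms)
    print(f"t = {t_stat:.3f}, df = {df:.1f}")
```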
The following hypotheses will be tested:

Competing interests
DS and JH are Wikipedians. They research Wikipedia and contribute to its content. In an effort to minimize the risk of conflict of interest, the Wikipedia article evaluated by participants in this study was selected by the researchers because its content had no contributions from DS and minimal contributions from JH.

Author contributions
• Design of study (JH, LR, DS)
• Preparation and revision of manuscript (LR, DS)
• Approval of submitted manuscript (JH, LR, DS)

Ethics statement
This study has been approved by the Hamilton Integrated Research Ethics Board (HIREB) under project ID 8228.

Funding
The authors have not received funding for this study.