A Systematic Review of Tools to Assess Coeliac Disease-Related Knowledge

Background: Coeliac disease (CD) is an immune-mediated disorder, with dietary exclusion of gluten the only current treatment. A good knowledge of CD and the gluten-free diet (GFD) is essential for those with CD to support effective self-management. Knowledge assessment with a validated tool helps evaluate understanding and identify knowledge gaps to better tailor educational resources. This study’s aim was to perform a systematic review to identify validated CD knowledge assessment tools. Methods: PRISMA guidelines were followed, and searches were carried out in five literature databases. Papers were reviewed for tool development and testing process and assessed against pre-defined criteria for feasibility, validity, and reliability. Results: Twenty-five papers were included in the final analysis. Studies were from 16 countries, with a range of target populations, study designs, and development processes. Eleven reported pilot testing, and five assessed readability. Content validity was assessed in ten papers, with formal content validity testing in one. Many tools contained items affecting generalisability outside the region in which they were developed. Conclusions: For a CD knowledge assessment tool to be suitable for use, it needs to be well designed, tested, and generalisable. No papers identified satisfied all requirements, thus highlighting a need to develop an appropriate tool.


Introduction
Coeliac disease (CD) is an immune-mediated enteropathy that manifests in genetically susceptible individuals as a result of exposure to gluten [1]. It is thought to affect around 1% of the general population worldwide, with increased prevalence seen in certain populations and families [2,3]. These populations include those with concomitant autoimmune disorders, Down syndrome, and Turner syndrome, as well as those with first- and second-degree relatives with CD [2,4-7].
To date, the only proven treatment for CD involves the strict, lifelong exclusion of sources of gluten from the diet [1]. This requires avoidance of the grains wheat, barley, and rye (and their relatives), a staple of the diet in many countries [1]. Many countries also recommend that oats be excluded due to the high likelihood of cross-contamination, and because a small subset of those with CD have an increased sensitivity to the similar proteins in some varieties of oats [8,9]. Strict avoidance of gluten is essential in order to attenuate the risk of disease-related complications, with even small amounts of gluten being harmful to some individuals [10]. This strict avoidance can be challenging due to the risk of cross-contamination with gluten-containing foods, changes to the ingredient lists of commercial products, hidden sources of gluten in the diet, and the often-higher cost of gluten-free alternatives [11]. Strict adherence to a gluten-free diet (GFD) requires continued vigilance, with proficient self-management the key to effective management of CD [12]. For individuals with CD, having a sound knowledge and understanding of their condition and the importance of maintaining a GFD is essential for adherence to the diet [13,14]. It is also important for the families of people with CD to understand CD in order to support them in adhering to a lifelong GFD [15].
Knowledge levels of people with CD should be formally assessed in order to identify gaps in understanding and to tailor support and education to address these. This assessment should be carried out using a validated tool that is appropriate for the population to be studied. The design of the knowledge assessment tools themselves should be robust to ensure that results obtained from the tool can be confidently relied upon to develop and modify interventions. While a number of CD knowledge assessment tools have been developed and presented in the literature, their applicability to all with CD, including children, adults, and those in different countries, is unclear. The overall objective of this study was to undertake a systematic review to identify the different CD knowledge assessment tools that have been developed internationally. The subsequent aims were to analyse tool characteristics to determine which tools are valid, reliable, and generalisable for use with different populations with CD in the clinical and research setting. Children were also included in the target audience as acquiring knowledge of CD and the ability to self-manage is a gradual process that starts in childhood [16]. Hence, this population could benefit from the use of validated knowledge assessment tools in supporting this process.

Process
This systematic review was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 (PRISMA) guidelines [17]. The protocol for this systematic review was not registered with a prospective register of systematic reviews.

Eligibility Criteria
To be included in the review, papers were required to include a written assessment tool to assess knowledge related to coeliac disease. No exclusion criteria were set for population, study type, or publication language in order to ensure identification of all available survey tools.

Information Sources and Search Strategy
A search of the following databases was completed in March 2024: Medline, Embase, Cumulative Index to Nursing and Allied Health Literature (CINAHL), PsycINFO, Scopus, and OpenGrey (via DANS). The search strategy included the MeSH and text word terms relating to the condition (coeliac disease), the variable to be measured (knowledge, information), the purpose of the tool (assess, measure), and how the tool may be defined (survey, questionnaire). A full breakdown of search criteria and strategies is included (Appendix A).

Selection and Data Collection Process
The titles and abstracts of all papers identified under the above search terms were collated into a database and duplicates removed. These records were then independently reviewed by two assessors (SH, AVR) for relevance and inclusion for a full-text review. Any disputed papers were included for full-text review. The full-text versions of the relevant articles were again reviewed independently by two assessors and categorised for inclusion or exclusion with a clear reason for any exclusions. Disputes regarding inclusion were resolved by discussion between all authors (SH, AVR, ASD).

Missing Data
In order to maximise the completeness of the data included, the authors of all papers that had not published their knowledge assessment tool questions were contacted at least twice to provide this information. If a paper included an incomplete assessment tool, the authors were contacted as above. If the full version was not provided, then the sections published were included in the review. Where papers or tools were not published in English, translations were sought from a translator with university-level qualifications in that language.

Data Items and Effect Measures
The details of the different studies were extracted from included papers and displayed in a comparative table. Characteristics included target cohort, country of origin, and study design. A second comparative table was compiled looking at the characteristics of the tools identified and the inclusion of considerations relating to feasibility, validity, reliability, and generalisability. Validated tools to assess these specific metrics are not available; therefore, we based our assessments on the following parameters.

Feasibility
Health literacy: Careful consideration should be given to the type and number of questions selected for a tool, as length and complexity can influence the quality of the data obtained. Questions should be clear and written in language understandable to the target audience.
Readability: Readability assessment ensures that the questions within the tool are written at a level that the targeted reader can fully comprehend. This prevents over-estimation of knowledge deficits caused by the user misunderstanding the wording of a question rather than its content. It is suggested that health information, particularly when targeting a broad range of literacy levels, be written at no higher than an American 6th- to 7th-grade level [18]. This can be assessed using metrics such as the Flesch-Kincaid readability tests, whereby a US 6th-7th-grade level equates to a Flesch Reading Ease score of 70-90 [19].
Brevity: It is not clear how long an assessment tool should be to optimise data quality. An association has been shown between survey length and response burden, although this must be weighed against the need for a questionnaire of sufficient length to answer the intended question [20].
Format: The type of response scales used may affect the amount and type of data that are obtained. For example, Likert scales tend to be used to measure linear agreement with a question, whereas dichotomous Yes/No or multiple choice questions are more commonly used to assess knowledge levels [21].
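As an illustration of the readability metrics referred to above, the following minimal sketch computes the two standard Flesch-Kincaid measures from word, sentence, and syllable counts. The function names are our own, and the counts are assumed to be supplied by the user (automated syllable counting is only approximate). This is not the procedure used by any of the reviewed studies.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    # Hypothetical counts for a short questionnaire (illustrative only).
    words, sentences, syllables = 100, 10, 130
    print(round(flesch_reading_ease(words, sentences, syllables), 1))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 1))
```

Both measures depend only on average sentence length and average syllables per word, so shortening sentences and replacing polysyllabic clinical terms are the two direct ways to improve a tool's score.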

Validity and Reliability
Following on from feasibility assessment, validation of a tool is essential to ensure that the tool is robust, fit for purpose, and the results obtained are reflective of the function of the tool [22].
Face validity is the more subjective assessment of whether the survey questions accurately measure the intended overall survey question [23]. This can be assessed through expert review and insights based on professional judgement and from participant feedback during pilot testing. However, it is thought to be a somewhat superficial measure of validity and hence not as useful as other forms of validation [24]. Content validity refers to ensuring the inclusion of relevant questions during the design/development phase in order to accurately and comprehensively assess the target metric [25,26]. Content validation during tool development begins with a literature review, the findings of which are then structured and subjected to iterative critical analysis by experts [21,26]. Construct validity ensures that the survey is effectively measuring the construct it is intended to measure [23].
Reliability assessment has two main aspects. The first assesses whether a tool can repeatedly achieve the same results from the same respondents, for example, through test/retest [23,24]. The second, internal consistency/reliability, relates to how well items in a survey correlate and measure the same general construct, for example, using Cronbach's alpha [24]. Additionally, inter-rater reliability considers agreement between raters.
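The two reliability statistics named above can be sketched as follows, using only the Python standard library. The function names and data layout (one row per respondent, one column per item) are assumptions made for illustration, and the sketch does not reproduce the analysis of any reviewed study.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(scores):
    """Internal consistency. scores: one row per respondent, one column per item."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    item_vars = [variance(col) for col in zip(*scores)]  # variance of each item
    total_var = variance([sum(row) for row in scores])   # variance of total scores
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(first, second):
    """Chance-corrected agreement between two sets of categorical answers,
    e.g. the same respondent's test and retest responses."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[c] * c2[c] for c in c1) / (n * n)  # agreement expected by chance
    if expected == 1:
        return 1.0  # both sets use a single identical category
    return (observed - expected) / (1 - expected)
```

For example, `cronbach_alpha([[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 1]])` evaluates to approximately 0.6, and `cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])` to 0.5, where 1 and 0 denote correct and incorrect answers.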

Generalisability
The generalisability of survey questions refers to how applicable they are to other/all populations. For example, questions could be specific to the regional foods, legislation, or health services of the population in which they were administered and therefore not appropriate for a broader audience.

Synthesis Methods
After tool characteristics were reviewed, all questions included in the tools were extracted and collated into a spreadsheet as outlined above. These questions were reviewed by two assessors and then grouped into broad themes according to the knowledge area they addressed, and duplicated questions were combined. From these broad themes, questions were categorised into knowledge domains. Data were then sought on the frequency with which different questions were asked in the various tools reviewed.
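The frequency tally described above could be reproduced from such a spreadsheet along the following lines. This is a minimal sketch: the tool labels and rows are hypothetical placeholders rather than extracted data, and the function name is our own.

```python
from collections import Counter


def domain_frequencies(rows):
    """rows: (tool, domain) pairs, one per extracted question.

    Returns the number of questions per domain and the number of
    distinct tools that asked at least one question in each domain.
    """
    questions_per_domain = Counter(domain for _, domain in rows)
    tools_per_domain = Counter(domain for _, domain in set(rows))
    return questions_per_domain, tools_per_domain


if __name__ == "__main__":
    # Hypothetical extract for illustration only.
    rows = [
        ("Tool A", "Identifying gluten in the diet"),
        ("Tool A", "Identifying gluten in the diet"),
        ("Tool A", "General CD knowledge"),
        ("Tool B", "Identifying gluten in the diet"),
        ("Tool B", "Nutrients and a GFD"),
    ]
    questions, tools = domain_frequencies(rows)
    for domain, count in questions.most_common():
        print(f"{domain}: {count} questions across {tools[domain]} tools")
```

Counting distinct tools separately from total questions avoids a single long tool dominating the apparent importance of a domain.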

Study Selection
Initial searches identified 595 publications (Figure 1). Following removal of duplicates (n = 271) and title/abstract review, 90 of these papers met the inclusion criteria for full-text review. Two hundred and thirty-six were excluded at this stage as they did not meet inclusion criteria (in particular, they did not include a written assessment tool to assess knowledge related to CD). Of the 90 papers that satisfied criteria for full-text review, a further 65 were subsequently excluded. Twenty papers used a tool that was not looking at the assessment of knowledge in particular, while in eighteen papers a tool could not be accessed after attempting to contact the authors on two or more occasions, or the paper was pending publication. Twenty papers used a tool that had already been identified and the original paper had already been included in the review, and five papers used a tool that was not specific to CD. The remaining 25 papers were included in the systematic review of knowledge assessment tools.

Study Characteristics
The 25 studies included in the final analysis were reviewed for their study characteristics (Table 1). The studies took place in 16 different countries between the years 2003 and 2024. The population sample size and study design varied considerably, from a 20-participant, single-centre study to a 3643-participant, multi-centre study. The majority of the studies were cross-sectional, with the exception of Gutowski et al. [33], Meyer et al. [40], and Vázquez-Polo et al. [47], which included longitudinal data collection, and Vernero et al. [48], who undertook an experimental study. The majority of studies (20 (80%)) looked at knowledge assessments in adults, 2 assessed parental knowledge in parents of children with CD, and 2 studies looked at knowledge in both a paediatric population and adults [30,41,42,49]. Vázquez-Polo et al. [47] planned to assess the knowledge of 10-12-year-olds following a nutrition education intervention. Knowledge of CD and a GFD was assessed in a range of population groups, with 13 studies among those diagnosed with CD, 7 among healthcare professionals, 5 among food servers and providers, and 3 among the general public.

Administration
The majority of the tools were self-administered (Table 2), with Garg and Gupta [30] providing a self-completed survey but in an interview setting. In the study by Garipe et al. [31], the tool was delivered by a trained interviewer. Roma et al. [42] also delivered their tool verbally, adjusting phraseology for the age and education level of participants. Eight of the surveys were delivered electronically, with Simpson et al. [44] offering a face-to-face option as well as the electronic survey, and one was delivered by post [45].

Length
The assessment tools varied in length from 5 to 85 questions (with the 85 questions presented as four question matrices). Six of the knowledge tools sat within larger surveys also looking at other factors such as adherence and attitudes towards the GFD [13,28,30,33,43,44]. Four studies did not use the same tool throughout the study [33,41,44,45]. Riznik et al. [41], Simpson et al. [44], and Tan et al. [45] used slightly different tools for the different target cohorts surveyed. Gutowski et al. [33] administered the same tool for all participants but a different set of questions, in this case a grocery list, at the three timepoints surveyed to assess label-reading comprehension.

Format
A range of different question formats was used in the tools (Table 2). The majority used a dichotomous or Yes/No format, with some offering a 'maybe' or a 'don't know' option as well. Multiple choice questions were the next most common format, with Likert and open-ended questions appearing less frequently. Broadly speaking, questions tended to focus on four main areas: the gluten content of foodstuffs; label reading or identifying gluten in the diet or in medications; CD and its pathophysiology; and the practical daily management of CD and the GFD.

Readability
Readability was mentioned in five of the studies; however, only one provided formal assessment using Flesch-Kincaid readability scores [50]. Howard [36], Silvester et al. [43], and Geiger et al. [32] all included feedback on readability during pretesting and piloting of the tools with peers and neighbours, Canadian Celiac Association members (n = 3), and upper-level dietetic students (n = 16), respectively. Conversely, Roma et al. [42] did not discuss readability assessment during the design phase; however, the questionnaire was delivered by investigators who 'adjusted phraseology ... as and when necessary'.

Piloting
Eleven studies reported undergoing some form of piloting of their tools prior to use. However, formal piloting was only reported in four studies, with a further six completing some form of piloting with results not reported. Garg and Gupta [30] stated that their tool was piloted as part of a previous study.

Validity and Reliability
Content Validity: Of the 25 tools included, 10 reported some form of content validity, although to varying extents. Dembiński et al. [29] and Paganizza et al. [13] reported development of their tool through systematic review of the literature, and although Paganizza et al. [13] did consult two experts during development, neither study alluded to the use of structured critical analysis of findings. Most reported the support of experts in their field in the development of their tools through peer review, face validity, or expert analysis of the proposed tool questions, suggesting an informal review of the content. Vernero et al. [48] undertook formalised content validation through expert development of the questions and review by both the lay public and volunteers with CD and a special interest in the GFD.
Construct Validity: Only Vernero et al. [48] described undertaking a full formal validation of their tool, with Garg and Gupta [30] mentioning the use of a previously validated survey. Vernero et al. [48] undertook validation using discriminant validity through the assessment of response variances between two groups: those with well-controlled CD and healthy controls.
Reliability: Roma et al. [42] and Zhou et al. [50] both described formal assessment of tool reliability. The former used the Kappa statistic and test-retest percentage agreement, and the latter used Cronbach's alpha to assess tool reliability.

Generalisability
Generalisability varied between the various tools. Seven tools included local foodstuffs and brands that were specific to the region where the tool was developed, such as barley squash, faggots, Prague ham, Jelly Babies, and Vegemite, and two included local legislation such as food-labelling requirements (Table 2).

Results of Synthesis
Through the systematic review process, a broad range of questions was identified for the assessment of CD knowledge. This may have been due in part to the wide range of target groups included, with knowledge assessment not limited to those with a diagnosis of CD. However, clear themes were identified in the questions asked. Most of the tools identified focused on knowledge surrounding management of a GFD, with only four studies not including an item about the identification of gluten in the diet [29,35,38,45]. All of these studies looked at the knowledge of healthcare professionals (HCPs). The study by Dembiński et al. [29] was focused on knowledge among healthcare professionals of the nutritional deficiencies that people with CD may face when following a strict GFD. A general question, "My knowledge of a gluten-free diet is sufficient", was included. This assumes that the HCPs taking part in the survey were aware of the avoidances required when following a strict GFD.
Only Dembiński et al. [29] and Geiger et al. [32], whose target audiences were HCPs, included questions regarding the nutritional concerns for those following a GFD. It is interesting that these questions were not included in more tools as they could potentially be an important knowledge requirement in effectively managing CD.
All CD knowledge assessment questions were extracted from the 25 tools included in the review. Matrix questions identifying gluten in common food products were counted as a single question. The questions were then pooled and grouped into the four main areas mentioned above, and the following nine knowledge domains emerged:
1. General CD knowledge
2. Management of CD
3. Identifying gluten in the diet and as ingredients, e.g., label reading
4. Food labelling and legislation
5. Nutrients and a GFD
6. Food handling practices and training
7. Diagnosing CD
The frequency of questions in each domain from the included tools was summarised (Table 3). The most common questions related to identifying gluten in the diet, either by selecting the grains that contain gluten or through label reading. The majority of the tools asked questions regarding knowledge of CD in general, with only two tools asking about knowledge of the nutrients at risk when following a GFD/for those with CD [29,32].

Certainty of Evidence
The studies and assessment tools included in this current review showed high levels of heterogeneity, but as the review is non-data-driven, no formal assessment could be carried out. Additionally, the focus of the review is on the survey tools included in the papers, not the study integrity itself, which leads to the certainty assessment being made according to metrics chosen to evaluate the survey tools. The objective assessment of survey tool composition, content, and format showed that, for most tools, various important elements had not been considered during the development process. This limits any statement of certainty as there was no standardisation of tool development between identified studies.

Discussion
Through the systematic review process, 25 tools were identified to assess knowledge of CD and its management. A predetermined, structured process was used to identify the papers and assess their merit as effective tools. A range of different study types was included in the review. The studies were based in 16 different countries over a period of 21 years and assessed knowledge of CD in a variety of different populations, including health professionals, those with CD, food service workers, and the lay public. The main topics covered in the assessment tools were general knowledge of CD and its management, including label reading, legislation, eating out, nutrients of concern, and potential non-food sources of gluten, as well as understanding the diagnostic processes for CD.
The purpose of this review was to find a knowledge assessment tool that could be used by healthcare professionals in the clinical and research setting to identify patient knowledge gaps about CD and the GFD that may affect their outcomes. For the tool to effectively satisfy this purpose, it needed to be well designed, tested, and generalisable to the intended population so that findings can be reliably interpreted when the tool is subsequently used [26]. Consideration of the feasibility of assessment tools, as related to health literacy, readability, and respondent burden, is of paramount importance for both adults and children with CD. However, no formal feasibility testing was carried out in any study included in the review.
Health literacy is considered to be the ability to understand, interpret, and act on health information [51,52], and reports have shown health literacy levels to be suboptimal both at school age and continuing into adulthood [53,54]. Hence, the language used in tools designed for the general public should reflect this in order to optimise understanding and participation [55], especially where children are included in the target cohort [56].
The language used in assessment tools should be of a readability level suitable for the target population, and tools should be of a length that answers the research question sufficiently while not placing undue burden on the user [20,57]. While assessment tools for clinical conditions such as CD may involve some complex clinical terms and jargon, overall, a tool should not have a readability level higher than US grade 7 [18]. Davis et al. [58] showed that level of education completed can in fact be an inaccurate estimate of reading level. They found that the self-reported education level of parents surveyed in an outpatient department was up to a grade-11 level. However, when assessed, reading levels were found to be on average four grade levels lower than this, at around a US 7th- to 8th-grade level. They also found that many of the health information resources provided correlated poorly with actual reading level, with only 3% of resources reviewed being written below a 7th-grade level [58]. Previous work has tried to alleviate the respondent burden of complex terms in paediatric health forms by the use of pictures, but this was not a simple fix as children still considered them as needing explanations [59]. The known positive association between higher knowledge levels and greater adherence to a GFD among both adults and children highlights how important the use of an appropriate assessment tool is to identify gaps or misconceptions in understanding [15,27,60]. In particular, adherence to a GFD in childhood has also been associated with better growth and quality of life, so inclusion of children in the target population of such assessment tools is crucial [61].
Following the development of assessment tools, undertaking a pilot study allows for testing viability prior to embarking on a full validation study [62]. This enables minor issues with the tool to be resolved prior to commencement and increases the likelihood of success [62]. Carrying out full validity testing then determines whether the measurement tool used actually measures the proposed research concept and can quantify the variables with stable or consistent responses [63]. Although a number of the studies in this current review reported informal pilot testing, no results were reported, and few carried out full validity testing. This thereby limits researcher or clinician confidence in using these published tools. In addition, many of the tools in the review had issues with generalisability, thereby further limiting their applicability.
For an intervention to be of use outside the setting where it was originally developed/evaluated, it must be generalisable to other regions, populations, and clinical settings [64]. In this review, generalisability was limited by the inclusion of questions regarding local food labelling and legislation, support groups, and the inclusion of regional foodstuffs, all of which may vary significantly in different countries. The cut-offs for allowable gluten traces in foods labelled gluten-free, for example, are different in different parts of the world [65], as are requirements for identifying gluten and potential contamination in packaged goods [10]. A local coeliac society was mentioned in one of the tools. Membership in a society has been associated with better adherence to a GFD, but this resource is not available in all countries and access and format may vary [66,67]. Foodstuffs mentioned in some tools may not be available internationally and may, therefore, be unfamiliar to some target audiences. Consideration in the context of maintaining a GFD must be given to possible variations in recipes of commercial foodstuffs, as well as the possibility of gluten cross-contamination in different countries [68].
No papers included in the review reported all of the pre-determined, necessary feasibility, validity/reliability, and generalisability metrics that would make it a robust and fit-for-purpose tool for the intended populations with CD in the clinical and research setting. The tool that most closely met the requirements for the study was from Vernero et al. [48], which presented a robust tool whose development included piloting and validation. The tool itself covered a broad range of knowledge question areas, covering six of the nine question domains that were developed in the process of data synthesis. Unfortunately, no readability assessments were described, and the tool was not generalisable to other populations due to the use of questions regarding EU-specific legislation and the inclusion of a number of local foodstuffs.

Limitations
Eighteen of the papers that met the basic inclusion criteria were not included as the assessment tool could not be obtained despite repeated attempts to contact authors. Inclusion of these tools would have substantially increased the number available for review (from 25 to 43), and their exclusion presents a risk of bias. The lack of standardisation in the format of identified surveys forms a large part of the discussion in this current work; however, it is in itself a study limitation. Inferences as to the best format and content for a coeliac disease knowledge assessment tool are restricted due to their heterogeneity, as well as by a lack of validated assessment tools with which to assess them. Evidence-based metrics were chosen to assess survey tools, but the use of a validated, comprehensive assessment strategy would have enabled more objective summaries and inferences, as well as future comparisons.

Strengths
A robust, transparent review process was undertaken to identify tools used to assess knowledge of CD and its management internationally. Substantial efforts were made to contact the authors of papers where the tool itself was not published in the paper. The tools identified and included in the review were from a variety of centres and with varying target populations. Content synthesis of the included assessment tools allowed for generation of clear themes that may be utilised in future assessment tool development.

Conclusions
The maintenance of a strict GFD is key to effective management of CD, with better knowledge supporting autonomy in self-management and adherence [69]. However, there are wide variations in practice for the education provided to people with CD and their wider communities [70]. Hence, assessing knowledge of CD and its management and identifying any gaps is essential for the planning and development of future education resources. The results of this current review will inform further research to address the lack of a suitable CD knowledge assessment tool in the literature.

Figure 1. PRISMA 2020 flow diagram of the article selection process.

Table 1. Summary of the characteristics of studies and their participants.

Table 2. Knowledge assessment tool characteristics, design and testing.