Evaluation of Standard Precautions Compliance Instruments: A Systematic Review Using COSMIN Methodology

Background: Standard precautions (SPs) are first-line strategies with a dual goal: to protect health care workers from occupational contamination while providing care to infected patients and to prevent/reduce health care-associated infections (HAIs). This study aimed at (1) identifying the instruments currently available for measuring healthcare professionals’ compliance with standard precautions; (2) evaluating their measurement properties; and (3) providing sound evidence for instrument selection for use by researchers, teachers, staff trainers, and clinical tutors. Methods: We carried out a systematic review to examine the psychometric properties of standard precautions self-assessment instruments in conformity with the COSMIN guidelines. The search was conducted on the databases PubMed, CINAHL, and APA PsycInfo. Results: Thirteen instruments were identified. These were classified into four categories of tools assessing: compliance with universal precautions, adherence to standard precautions, compliance with hand hygiene, and adherence to transmission-based guidelines and precautions. The psychometric properties of instruments and methodological approaches of the included studies were often not satisfactory. Only four instruments were classified as high-quality measurements. Conclusions: The available instruments that measure healthcare professionals’ compliance with standard precautions are of low-moderate quality. It is necessary that future research completes the validation processes undertaken for long-established and newly developed instruments, using higher-quality methods and estimating all psychometric properties.


Introduction
Standard precautions (SPs) are first-line strategies with a dual goal: to protect health care workers from occupational contamination while providing care to infected patients and to prevent/reduce health care-associated infections (HAIs). [...] the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) [17,18], which evaluates the psychometric properties and quality of the studies according to scientific criteria.
Therefore, the aims of this review were: (1) to identify the instruments currently available for measuring healthcare professionals' compliance with standard precautions; (2) to evaluate their measurement properties; and (3) to provide sound evidence for researchers, teachers, staff trainers, and clinical tutors to use when selecting instruments.

Materials and Methods
We carried out a systematic review to examine the psychometric properties of standard precautions self-assessment instruments in conformity with the COSMIN guidelines. The systematic review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, and the review protocol was registered in PROSPERO (record ID CRD42023408024).

Search Strategy
The search was conducted in the databases PubMed, CINAHL, and APA PsycInfo up to 1 March 2023. The PICOs formulated for the review were as follows: P, health care professionals' compliance with standard precautions; I, identification and evaluation of studies on the development and validation of instruments that examine compliance with standard precautions, and evaluation of the psychometric properties of the identified tools; C, comparison of psychometric properties, instrument by instrument; O, a GRADE rating of the quality of the instruments and a recommendation of those with a higher GRADE rating; S, tool development or evaluation studies. The search steps were performed according to the PRISMA statement [19].
The search strategy used the key elements of the construct of interest defined with the PICOs, as well as the search filters suggested by Terwee and colleagues [20], combining them with the AND, OR, and NOT operators. An example of the query used on PubMed is given in Appendix A. To manage the research process, we used EndNote ver. 8.2.
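The way such a query is assembled can be illustrated with a minimal sketch. It assumes that the terms within each block are joined with OR and that the blocks are then joined with AND; the term lists are abbreviated from Appendix A, the measurement-property filter of Terwee and colleagues is not reproduced, and the helper names `or_block` and `build_query` are hypothetical.

```python
def or_block(terms):
    """Join the terms of one block with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*blocks):
    """Combine several OR-blocks with AND (assumed combination rule)."""
    return " AND ".join(or_block(block) for block in blocks)


# Abbreviated term lists taken from Appendix A (not the full filter).
construct = ["standard precautions", "hand hygiene", "personal protective equipment"]
outcome = ["compliance", "adherence"]
population = ["nurse", "nurses", "health care professional"]
instrument_type = ["scale", "questionnaire", "instrument"]

query = build_query(construct, outcome, population, instrument_type)
print(query)
```

The printed string can be pasted into a database search box; the full term lists in Appendix A would be substituted for the abbreviated ones.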

Inclusion and Exclusion Criteria
We included in this review articles fulfilling the following criteria: (1) articles on the development and/or psychometric validation of tools assessing healthcare professionals' compliance with standard precautions; (2) articles on the cultural adaptation or linguistic validation of the instrument in another country; (3) articles published in academic and peer-reviewed journals; (4) articles written in English, Italian, Peruvian, Spanish, Portuguese, and French. No time limit was applied.
Studies whose main objective was not the assessment of the measurement properties of instruments evaluating compliance with standard precautions (e.g., cross-sectional studies that used ad hoc surveys just to measure compliance without assessing psychometric properties) were excluded. In addition, articles that did not publish the tool itself were excluded because the tool needed to be evaluated by the reviewers, as stated in the COSMIN guidelines.

Qualitative Evaluation of Studies, Psychometric Properties, and Synthesis of Evidence for the Instruments
To conduct the data synthesis, we used the COSMIN guidelines. In recent years, these guidelines, initially developed to conduct systematic reviews of patient-reported outcome measures (PROMs), have been used to assess outcomes in healthy individuals or caregivers [21]. The evaluation and synthesis process is divided into two phases: one that evaluates and summarizes the evidence on the development and validation studies' quality and one that evaluates and summarizes the evidence on the measurement properties of the instruments.
The first phase is divided into four steps. In the first step, two reviewers independently assessed the methodological quality of each study using COSMIN Box 1, which examines the relevance of the new tool's items and the comprehensiveness and comprehensibility of the cognitive interview or the pilot study. In the second step, the quality of the validation studies was evaluated using COSMIN Box 2. This box is divided into five sections that examine relevance, comprehensiveness, and comprehensibility; reviewers complete only the sections that correspond to what was performed in the study (e.g., if professionals were not consulted in the content validity study, sections 2d and 2e of the box can be skipped). In the third step, the evidence from the studies was summarized, and the tools were evaluated to determine an overall rating for relevance, comprehensiveness, comprehensibility, and content validity (from sufficient to indeterminate). In the fourth step, confidence in the overall ratings (high, moderate, low, or very low) was determined using the modified Grading of Recommendations Assessment, Development, and Evaluation (GRADE) approach. According to the COSMIN 2018 guidelines, level A is assigned when there is evidence for sufficient content validity and at least low-quality evidence for sufficient internal consistency, level C is assigned when there is high-quality evidence for an insufficient measurement property, and level B is assigned when the scale cannot be classified as level A or C.
The second phase is a three-step process. In the first step, two reviewers independently evaluated the methodological quality of each study with the COSMIN Risk of Bias checklist. In the second step, each measurement property was evaluated according to the COSMIN checklist criteria. In the third step, the evidence for each instrument was summarized with a rating of its psychometric properties (from sufficient to indeterminate) and of the quality of the evidence (high, moderate, low, very low) using the GRADE approach. According to this approach, recommendations can be made on the use of tools: recommended for use (level A), potentially recommended but requiring further study (level B), and not recommended for use (level C).
In order to evaluate the validity of the contents and the psychometric properties, the review team used the Excel file downloadable from the COSMIN website.

Data Extraction
During the evaluation process, two researchers (ML and DI) extracted from the studies data on instrument title, author, year and country of publication of the study, type of study (development or validity study), sample characteristics, number of items, response system, and psychometric properties investigated. These data were used by the review team to describe the characteristics of the studies and the psychometric characteristics of the instruments (see Table 1).

Results of the Studies Included in the Review
A total of 28 articles (12 on development and 16 on validation) containing 13 measurement tools were included in the review (see Figure 1). These studies were conducted on different continents: Asia (Hong Kong 4 studies, Iran 2 studies, China 2 studies, Turkey 2 studies, Saudi Arabia 1 study), Europe (Italy 1 study, France 1 study), Oceania (Australia 3 studies), and America (Brazil 8 studies, Bogotá 1 study, Ohio 1 study, Minneapolis 1 study, New York 1 study) (Table 1).
The tools identified can be classified into four categories. The first category includes tools that assess adherence to Universal Precautions (UPs): Compliance with Universal Precautions (CUPs) [22,23], Knowledge and Practices Universal Precautions Scale (KPUPs) [29], and Universal Precaution scale (UPs) [26,27].
The descriptions of the studies and the instruments, with their psychometric properties, are presented in Table 1.

Methodological Quality, Overall Rating, and GRADE's Quality of Evidence
In the evaluation of the quality of evidence, seven instruments were rated Moderate (CSPS, FIASP, HAI, HHQ, QKCSP, SPQ, UPs), one Low (AGHPC), and five Very Low (ARPG, CUPs, KAPs, KPUPs, QCSP). These ratings were determined by the quality and quantity of the validation and development studies reviewed.
Despite the low ratings obtained (low and very low), the studies were not excluded from subsequent evaluations, in accordance with the COSMIN guidelines.
Contributing to the low scores for relevance, comprehensiveness, comprehensibility, and content validity were some biases in the design of the studies, which received mostly doubtful ratings. Such ratings were assigned mainly to the instrument development procedures and pilot tests. In fact, many studies did not give clear and comprehensive information about the qualitative methodology for identifying relevant items, the presence of trained moderators or interviewers, the publication of interview guidelines in the article, the process of recording and transcribing participants' responses, the process of independent data coding, and the achievement of qualitative data saturation.
In addition, in the pilot tests, respondents did not provide clear and comprehensive statements on the relevance, comprehensiveness, and comprehensibility of the items. The number of people enrolled in the pilot test was often insufficient, as was the number of members of the expert panel, whose expertise was sometimes not specified. Moreover, for some instruments (ARPG, KPUPs, and KAPs), only development studies with doubtful quality ratings were included in the review, further penalizing the GRADE rating. For the GRADE evidence quality scores, see Table 2.

Psychometric Properties, Overall Rating and GRADE's Quality of the Evidence
At the stage of assessing the psychometric properties of the instruments included in the review, six instruments were rated of moderate quality (CSPS, CUPs, FIASP, HHQ, KPUPs, and SPQ) and seven instruments as low (AGHPC, ARPG, HAI, KAPs, QCSP, QKCSP, and UPs).
These scores were determined by the procedures used to test psychometric properties and were influenced by some biases. For example, low scores were assigned if, in the structural validity test, the sample size was not adequate for analysis (adequate rating: at least 5 times the number of items and ≥100, or at least 6 times the number of items but <100).
Based on the psychometric properties analyzed in the studies and shown in Table 1, we were able to assess whether they met the criteria of good measurement properties given in the COSMIN guidelines.
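The sample-size criterion for structural validity quoted above can be written as a small check. This is a minimal sketch that encodes only the "adequate" threshold as stated in the text; the function name and the example numbers are hypothetical and do not come from the included studies.

```python
def meets_adequate_sample_size(n_respondents: int, n_items: int) -> bool:
    """Return True if the sample reaches at least the 'adequate' threshold.

    Rule as quoted in the text:
      - at least 5 times the number of items and >= 100 respondents, or
      - at least 6 times the number of items but < 100 respondents.
    """
    if n_respondents >= 100:
        return n_respondents >= 5 * n_items
    return n_respondents >= 6 * n_items


# Hypothetical examples (illustrative numbers only).
print(meets_adequate_sample_size(120, 20))  # 120 >= 100 and 120 >= 5 * 20
print(meets_adequate_sample_size(90, 20))   # 90 < 100 and 90 < 6 * 20
print(meets_adequate_sample_size(70, 10))   # 70 < 100 but 70 >= 6 * 10
```

Under this rule, a long questionnaire needs a proportionally larger sample: a 46-item instrument, for example, would need at least 230 respondents.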
Finally, based on the quality of the studies and the good psychometric properties of the instruments, we provided recommendations according to the modified GRADE method given in the COSMIN guidelines. An instrument rated GRADE A had sufficient content validity (+) at any level of evidence and at least low-quality evidence for sufficient internal consistency. An instrument rated GRADE C had high-quality evidence for an insufficient measurement property. GRADE B was assigned when neither GRADE A nor GRADE C applied. However, we also considered assigning GRADE C to instruments that had been developed less recently and not further validated, and that had inconsistent content validity and insufficient psychometric properties with at least moderate-quality evidence.
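The three-level recommendation rule can be summarized in a short sketch. It is an illustration under stated assumptions, not the reviewers' actual procedure: the function and argument names are hypothetical, ratings are encoded as "+", "-", "±", or "?", the sketch assumes that level C takes precedence when both the A and C conditions hold, and it does not cover the additional case-by-case assignment of level C described above.

```python
QUALITY_LEVELS = ("very low", "low", "moderate", "high")


def recommend(content_validity: str,
              internal_consistency: str,
              ic_quality: str,
              high_quality_insufficient: bool = False) -> str:
    """Return the recommendation level 'A', 'B', or 'C'.

    content_validity / internal_consistency: '+', '-', '±', or '?'.
    ic_quality: quality of evidence for internal consistency.
    high_quality_insufficient: True if any measurement property is
        insufficient with high-quality evidence.
    """
    if ic_quality not in QUALITY_LEVELS:
        raise ValueError(f"unknown quality level: {ic_quality!r}")
    # Assumption: level C overrides level A if both conditions hold.
    if high_quality_insufficient:
        return "C"
    at_least_low = QUALITY_LEVELS.index(ic_quality) >= QUALITY_LEVELS.index("low")
    if content_validity == "+" and internal_consistency == "+" and at_least_low:
        return "A"
    return "B"


# Ratings as reported in this review for two instruments.
print(recommend("+", "+", "moderate"))  # CSPS: +/M and +/M
print(recommend("±", "+", "low"))       # QKCSP: inconsistent content validity, +/L
```

With the ratings reported in this review, the sketch returns A for the CSPS and B for the QKCSP, matching the levels assigned to these two instruments.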

Standard Precautions Compliance Instruments
There were 13 instruments included in this review; we present here a brief narrative overview of the instruments. For a complete overview of the instruments and the procedures adopted in their development and validation, see Table 1.
The Compliance with Universal Precautions scale (CUPs) originates from the Work System Model of Dejoy and colleagues [22,23], which states that compliance with universal precautions occurs at three levels: health care worker, work dynamics, and organizational context. It is an 11-item scale with a 5-point Likert response system. The total scale score ranges from 11 to 55, and high scores represent high levels of compliance with UPs. The items assess the frequency with which workers follow specific recommendations during their work, such as proper disposal of sharps and needles and use of barrier protection (gloves, eye protection, protective clothing), as well as poor habits such as eating or drinking in potentially contaminated areas. Two validation studies [22,23] were included in the review; the first assessed compliance with UPs [22], and the second assessed adherence to SPs [23]. Both used the CUPs along with other instruments and assessed their psychometric properties. These two studies yielded very low and inconsistent content validity (±/VL) because the procedures for assessing comprehensiveness and comprehensibility were not clearly described, which resulted in inconsistent or indeterminate ratings. Finally, the GRADE rating of this instrument is C, as it also obtained an insufficient score on the psychometric properties measured.
The Handwashing Assessment Inventory (HAI) was developed by O'Boyle and colleagues in 2001 [24] to assess factors that motivate health care workers to wash their hands. The scale is based on the Theory of Planned Behavior (TPB) [48], which describes how individuals modify their behavior, which, in this case, consists of performing and complying with proper handwashing. The scale consists of 46 items divided into 6 sections representing the 6 principles of planned behavior in hand hygiene: beliefs about outcomes, attitudes, referent beliefs, subjective norms, control beliefs, and perceived control. The response system is a 7-point Likert scale. The score for each section is calculated by summing the individual item scores and dividing by the number of items each participant responded to. Negatively worded items should be reverse-scored before calculating the scores for each section of the HAI. Higher scores in the HAI reflect more positive motivation in hand washing. The scale development study [24] and the validation study conducted in a hospital in Bogotá [25] were included in the review. The GRADE assessment of this instrument was type B because there was low-quality evidence of insufficient psychometric properties and inconsistent content validity of moderate quality.
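The section scoring described for the HAI (mean of the answered items after reversing negatively worded items) can be sketched as follows. The sketch assumes that responses are coded 1 to 7 and that reversal maps a response x to 8 - x; neither the coding nor the list of reversed items is given in this review, so both are illustrative assumptions, as is the function name.

```python
from typing import Iterable, Optional, Sequence


def section_score(responses: Sequence[Optional[int]],
                  reverse_items: Iterable[int] = (),
                  scale_min: int = 1,
                  scale_max: int = 7) -> Optional[float]:
    """Mean of the answered items in one section.

    responses: one value per item; None marks an unanswered item.
    reverse_items: zero-based positions of negatively worded items
        (hypothetical; the review does not list them).
    """
    reverse = set(reverse_items)
    answered = []
    for position, value in enumerate(responses):
        if value is None:
            continue  # unanswered items are left out of the denominator
        if position in reverse:
            value = scale_min + scale_max - value  # e.g. 2 becomes 6 on a 1-7 scale
        answered.append(value)
    if not answered:
        return None  # no answered items, so no score can be computed
    return sum(answered) / len(answered)


# Hypothetical respondent: third item skipped, fourth item negatively worded.
score = section_score([7, 6, None, 2], reverse_items=[3])
print(round(score, 2))
```

Dividing by the number of answered items rather than the number of items in the section keeps section scores on the same 1 to 7 range when a respondent skips a question.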
The Universal Precaution scale (UPs) was developed by Chan and colleagues in 2002 [26] and was used to assess compliance with UPs by nursing students and nurses [26,27]. The items were constructed based on the UPs guideline recommended by the Hong Kong Hospital Authority in November 1988. The questionnaire consists of three parts. The first collects demographic data. The second part assesses knowledge about universal precautions and consists of 11 items with a dichotomous response system (true or false). A score of 1 is assigned to each correct answer, and the maximum possible score is 11. The higher the score, the higher the knowledge about UPs. The third part assesses compliance with universal precautions. It consists of 15 items with a 4-point Likert response system. The total scale score ranges from 0 to 33. The higher the score, the higher the compliance with UPs. The areas explored by the instrument are the use of protective equipment, sharps disposal, disposal of contaminated waste, decontamination of patient body fluids and instruments used in care, and prevention of person-to-person cross-infection. There were two studies included in the review, one developmental [26] and one validation [27]. The GRADE of the instrument is type B because the content validity is inconsistent and of moderate grade, and the evidence on psychometric properties is sufficient but of low grade.
The Attitudes Regarding Practice Guidelines (ARPG) was developed in 2004 [28] to assess obstacles to adherence to the guidelines in general and to hand hygiene. There are 18 items for the general part and 18 items for the specific part. The response system is a 6-point Likert scale. Possible scores for the two subscales range from 0 to 108, with higher scores indicating fewer perceived obstacles. In addition, the instrument asks the respondent to indicate the most important factors that facilitated or hindered guideline use and to self-report the percentage of times he or she uses a hand alcohol product. The development study, whose methodological quality was doubtful, was included in the review [28]. No validation studies of the instrument were found. The instrument's content validity obtained an inconsistent and very low score because the procedure by which experts attested the comprehensibility and relevance of the items was not clearly stated, and the procedure for the target group of interest was not reported. In spite of this, the GRADE of the instrument is type B because of a sufficient, although low-quality, psychometric property.
The Knowledge and Practices Universal Precautions Scale (KPUPs) was developed in 2006 [29] to assess the knowledge and practice of UPs, based on the guidelines recommended by the CDC in 1987 (12 items) [49] and a questionnaire devised by Chan et al. in Hong Kong [26]. Consisting of 18 items, it is divided into two subscales: knowledge (10 items) and practices (8 items). The response system is dichotomous (True/False for Knowledge and Agree/Disagree for Practice subscales). The maximum score for knowledge is 10, and the higher it is, the higher the knowledge about UPs. The maximum score for practices is 8, and the higher it is, the higher the adherence in practice to UPs. The fields investigated are the understanding of precautions, disposal of sharps, contact with vaginal fluid, handwashing, disposal of needles and gloves, and mask and gown usage. The GRADE of the instrument is C because the content validity is inconsistent and very low, and the psychometric properties are insufficient with a moderate level of evidence.
The Knowledge, Attitudes and Practices scale (KAPs) was developed in 2008 [30] based on a literature review and expert opinion. The instrument includes demographic questions and a series of items that form subscales. The first subscale is on knowledge of SPs and TBPs with four multiple-choice questions. Correct answers receive one point; the total score therefore ranges from 0 to 4, and higher scores indicate a higher knowledge level. The second subscale consists of 11 items assessing attitudes in choosing Personal Protective Equipment (PPE) (three items), wearing PPE (four items), and handling high-risk procedures (four items). The response system is a five-point scale, and higher scores indicate stronger agreement on attitudes. The third subscale consists of 10 items on practices in wearing gloves (three items), wearing gowns and eye shields/goggles (four items), and following the precautionary guidelines and the contingency plan (three items). The response system is a five-point scale, and higher scores indicate greater compliance. The instrument earned a GRADE of type B because it had inconsistent content validity of very low quality but sufficient internal consistency of low quality.
The Hand Hygiene Questionnaire (HHQ) was developed in 2009 [31] to assess knowledge, beliefs, and practices in hand washing. Its guiding theory is Bandura's social cognitive theory [50]. The HHQ includes three scales (36 items): a hand hygiene beliefs scale (HBS) (19 items), a hand hygiene importance scale (HIS) (3 items), and a hand hygiene practices inventory (HHPI) (14 items). The response system is multiple choice for the HBS and a 5-point Likert scale for the HHPI and HIS. Three studies were included in the review, one developmental [31] and two validation [32,33]. The ratings for relevance, comprehensiveness, comprehensibility, and overall content validity were sufficient and of moderate quality. The psychometric properties of the instrument were also sufficient and of moderate quality. Therefore, the GRADE obtained by the instrument was type A.
The Questionnaires for knowledge and compliance with standard precaution (QKCSP) were developed in 2010 in China [34]. Their development originates from the guidelines of the Centers for Disease Control and Prevention (CDC) in the United States, which established the concept of standard precautions (SPs) in 1996 [51]. The SP knowledge questionnaire includes 20 questions, with possible "yes", "no", or "don't know" answers. One point is added for each "yes", and the maximum possible score is 20 points. The higher the score, the greater the knowledge. The SP adherence questionnaire includes 20 questions with a 4-point Likert response system. The total possible score ranges from 0 to 80 points. The higher the score, the greater the individual's adherence to SPs. A developmental study [34] and a validation study [35] were included in the review. In the development study, unlike the validation study, the procedures that assessed the face validity of the instrument were not clearly described, producing indeterminate and moderate results in content validity (±/M). The GRADE of the instrument is type B because the internal consistency was sufficient, although of low quality (+/L).
The Compliance with Standard Precautions Scale (CSPS) was developed in 2011 [36] by modifying the 15 items of Chan and colleagues' 2002 UPs and adding others. The final scale consists of 20 items, with a 4-point Likert response system to assess the use of protective equipment, disposal of sharps and waste, decontamination of used instruments from biological fluids, and prevention of cross-infection. Higher values indicate better compliance with SPs. The CSPS has been translated into several languages and adopted in various countries; the translations include, but are not limited to, Arabic [38], Brazilian Portuguese [39], Italian [3], and Turkish [5]. One developmental study [36] and five validation studies that met the inclusion criteria [3,5,37–39] were included in the review. The scale achieved a GRADE of type A because content validity was sufficient and of moderate quality (+/M), and internal consistency was sufficient and of moderate quality (+/M).
The Questionnaire for compliance with standard precaution (QCSP) was developed in 2013 by Felix in a doctoral dissertation on a sample of 1444 Chinese nurses. It consists of twenty Likert scale questions scored from 0 to 4 points; the total score ranges from 0 to 80 points, and higher scores indicate higher compliance with SPs. The development study was not included in the review because it was a doctoral thesis; only a validation study was included [40]. The methodological quality of the validation study was very low, and face and content validity were not assessed. However, the reviewers considered the items valid in terms of relevance, comprehensiveness, and comprehensibility. Therefore, content validity was rated as sufficient, although of very low quality. The instrument received a GRADE of type A because the psychometric properties had sufficient scores even though they were of low quality.
The Standard Precautions Questionnaire (SPQ) was developed in 2016 in France [41] to determine socio-cognitive factors, attitudes, behaviors, limitations, and individual and organizational constraints to SPs compliance. A development article [41] and two validation studies [41,42] were included in the review. From the literature review of existing instruments and analysis of some interviews, a 35-item questionnaire with a 5-point Likert response system was developed, later reduced to 24 items and 7 subdimensions: exemplary behavior (2 items), organizational constraints (4 items), intention to perform standard precautions (4 items), social influence (4 items), attitude toward standard precautions (3 items), facilitating organization (3 items), and individual constraints (4 items). The items are visually organized into five parts: knowledge about SPs, work environment, factors that enable compliance with SPs, factors that hinder compliance with SPs, and intention to comply with SPs. The domains explored by the instrument are prevention of infection, influence and exemplary behavior of colleagues, facilities available in a health care setting, training and reminders in the use of SPs, the occurrence of unanticipated events, lack of time, heavy workload, lack of knowledge about SPs, personal beliefs, and problems related to the use of equipment. The instrument demonstrated sufficient content validity of moderate quality and sufficient internal consistency of moderate quality, so it was assigned a GRADE of type A.
The Factors Influencing Adherence to Standard Precautions Scale (FIASP) was developed in 2019 [44] to explore factors that may impact nurses' adherence to SPs. The FIASP scale, which in the developers' final form consists of 29 items, measures five factors that influence nurses' adherence to SPs: leadership among colleagues, awareness of environmental stimuli for SPs implementation, an organization promoting or hindering SPs implementation, making professional judgments or evaluating situations and patients, and the justifications nurses may give for their adherence or non-adherence to SPs. A development study [44] and a validation study [45] of the FIASP were included in the review. The score assigned to content validity was affected by the unclear description of procedures to assess the relevance, comprehensiveness, and comprehensibility of items. However, internal consistency was rated as sufficient and of moderate quality, and the instrument was assigned a GRADE of type B.
The Adherence to Good Hospital Practices for COVID-19 (AGHPC) was developed in 2022 by Meneguin and colleagues [46,47] to assess the adherence of healthcare providers to good practices for COVID-19 in the hospital setting. The Health Belief Model, developed by U.S. psychologists in 1950, was used as the theoretical framework for developing the instrument [52]. According to this model, the adoption of preventive behavior depends on considering oneself vulnerable to a particular health problem that may affect us sooner or later (perceived susceptibility), perceiving that the health problem may have serious consequences (perceived severity), and believing that the health problem can be prevented by a particular action (perceived benefits) regardless of whether that action involves negative aspects (perceived barriers). The AGHPC consists of 47 items with a 5-point Likert response system and 3 subdimensions: personal, organizational, and psychosocial. The instrument achieved sufficient content validity because the procedures to assess relevance, comprehensiveness, and comprehensibility were clearly and comprehensively described [46,47]. However, the quality of evidence is low because no validation studies were found in the literature, only the two developmental ones. The GRADE of the instrument is type B because the psychometric properties had poor ratings, of low quality, in both structural validity and internal consistency.

Discussion
In our systematic review, a total of 28 studies emerged that estimated the reliability and validity of 13 instruments assessing healthcare workers' SPs compliance in 13 different countries belonging to 4 continents (Asia, Europe, America, and Oceania). Most of the studies were conducted in Asia and America (23 studies). The first instrument developed was the CUPs in 1995 [22], and the last was the AGHPC in 2022 [46,47]. This means that this research topic has been covered for almost 30 years, in which there have been several modifications and changes in international knowledge and guidelines for the prevention of HAIs (Healthcare Associated Infections) [26,34].
The tools identified can be classified into four categories: tools that assess adherence to Universal Precautions (CUPs, KPUPs, and UPs), instruments that assess compliance with Standard Precautions (QKCSP, QCSP, CSPS, SPQ, and FIASP), scales that assess attitudes, behaviors, and beliefs that affect hand hygiene adherence (HAI and HHQ), and tools that assess compliance with guidelines and transmission-based precautions (KAPs, ARPG, and AGHPC).
In the first category, compliance with Universal Precautions was assessed [22,26,29]. The tools were designed based on national and CDC guidelines (UPs and KPUPs) and on the Work System Model of Dejoy [22]. Of these instruments, only UPs reach the recommendation level of GRADE B, while CUPs and KPUPs achieve level C. For CUPs and KPUPs, this was because in the studies included in the review, content validity and internal consistency did not achieve sufficiency, and the quality of evidence was moderate. In contrast, for UPs, internal consistency was sufficient, but content validity was indeterminate, which did not allow it to reach a GRADE level A of recommendation.
The second category, on the other hand, includes tools designed after 2009 that assess compliance with Standard Precautions. These instruments were developed based only on a literature review (QKCSP and QCSP), on a combination of a literature review and interviews with healthcare workers (SPQ and FIASP), or on a modification of a scale that assessed UPs compliance (CSPS). Of these tools, three achieved GRADE level A because content validity and internal consistency were sufficient. Therefore, when measuring compliance with standard precautions, we recommend using the QCSP, SPQ, and CSPS as the instruments with higher psychometric quality.
The third category includes instruments that assess healthcare workers' compliance with hand hygiene and originate from Bandura's behavioral theories (HHQ) or planned behavior theories (HAI). Among these instruments, HHQ had a GRADE level A; therefore, we suggest its use when evaluating compliance with hand hygiene.
Finally, the last category included tools that assess compliance with additional precautions, such as the KAPs, which assess compliance with SPs and TBPs. The AGHPC assesses compliance with good practices for COVID-19, and the ARPG assesses obstacles to adherence to the guidelines in general and to hand hygiene. All three instruments achieved only a GRADE level B. Therefore, further research is needed to test in other settings the present instruments or to develop others of a better quality.
A common problem in the evaluation of studies was the difficulty of comparing results from different studies that used the same instruments. This was due to the heterogeneous methodological quality of the studies and to the validation studies being conducted at different times, when some analyses may not yet have been known or may have since become obsolete. Another problem encountered was that the internal consistency and structural validity of most of the instruments were evaluated with methodological approaches of different quality, which also compromised the quality of the evidence for the results. Finally, convergent validity and criterion validity were assessed on only a few occasions, i.e., in instruments assessing SPs or TBPs (CSPS, KAPs, and QCSP) or those assessing hand hygiene compliance (HAI and HHQ). We hypothesize that for other instruments, they were not assessed because of a lack of field knowledge and of instruments that could represent the gold standard for comparison.

Limitations
One limitation of this review is that it included only peer-reviewed studies in academic journals and applied language restrictions. This may have resulted in publication and selection bias, as other tools may have been developed and disseminated as gray literature or in different languages. In any case, we tried to learn about possible tool developments when only validation studies had been found, as in the case of the QCSP. The evaluation of the studies was based on the COSMIN 2018 guidelines, and some of the criteria required for a "very good" or "adequate" evaluation may not have been considered by the authors of the older studies, which may have influenced the final evaluation of the instruments. Finally, it was not possible to assess the responsiveness of the instruments, that is, the ability of an instrument to detect a change in the measured construct over time (as required by the COSMIN procedure), because there were no longitudinal studies among those included.

Conclusions
Thirteen instruments assessing healthcare workers' SPs compliance have undergone a validation process so far. Some have been developed from behavioral theories, some from literature reviews, and some have blended, revised, and integrated several already validated instruments. Not all relevant psychometric properties have been evaluated for the instruments, and the methodological approaches used are often doubtful or inadequate. In addition, a lack of homogeneity has emerged in the procedures both for assessing the relevance, comprehensiveness, and comprehensibility of the instruments and for assessing psychometric properties, thus threatening the external validity of the instruments. Future research should complete the validation processes undertaken for newly developed and long-established instruments, using higher-quality methods and estimating all psychometric properties.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Search Filter Used in PubMed
• Construct: (("standard precautions" AND "infection control") or "standard precautions" or "transmission based precautions" or "droplet precautions" or "contact precautions" or "isolation precautions" or "contact isolation" or "hand hygiene" or "respiratory hygiene" or "cough hygiene" or "care of the environment" or "safe injection practices" or ppe or "personal protective equipment" or "face mask" or protection or mask or gown or gloves or "aseptic technique" or asepsis or aseptic or "non-touch technique" or "cough etiquette" or "patient placement") AND (compliance or adherence or "non compliance" or "non adherence" or non concordance).
• Population: nurse or nurses or nursing or "nursing staff" or "health care professional".
• Type of instruments: scale or questionnaire or assessment or measure or inventory or instrument.