Device errors in asthma and COPD: systematic literature review and meta-analysis

Inhaler device errors are common and may impact the effectiveness of the delivered drug. There is a paucity of up-to-date systematic reviews (SRs) or meta-analyses (MAs) of device errors in asthma and chronic obstructive pulmonary disease (COPD) patients. This SR and MA provides an estimate of overall error rates (both critical and non-critical) by device type and evaluates factors associated with inhaler misuse. Embase®, MEDLINE®, MEDLINE® In-Process and CENTRAL were searched from inception to July 23, 2014, using predefined search terms. Studies in adult males and females with asthma or COPD, reporting at least one overall or critical error, using metered dose inhalers and dry powder inhalers were included. Random-effects MAs were performed to estimate device error rates and to compare pairs of devices. Overall and critical error rates were high across all devices, ranging from 50–100% and 14–92%, respectively. However, between-study heterogeneity was also generally >90% (I-squared statistic), indicating large variability between studies. A trend towards higher error rates with assessments comprising a larger number of steps was observed; however, no consistent pattern was identified. This SR and MA highlights the relatively limited body of evidence assessing device errors and the lack of standardised checklists. There is currently insufficient evidence to determine differences in error rates between different inhaler devices and their impact on clinical outcomes. A key step in improving our knowledge on this topic would be the development of standardised checklists for each device.


INTRODUCTION
Inhaled medications are fundamental to the treatment of asthma and chronic obstructive pulmonary disease (COPD), with inhaler devices being the principal route for administering such treatments. 1,2 Many different types of inhaler are available, but pressurized metered dose inhalers (pMDIs) and dry powder inhalers (DPIs) are the devices most commonly used for drug delivery in the treatment of asthma and COPD. 1, 2 A large number of asthma and COPD patients do not use their inhaler devices correctly. Errors in device use may impact the effectiveness of the delivered drug and thereby lead to the sub-optimal control of asthma and COPD. [3][4][5][6] It is therefore important to understand and quantify device-use errors so that patient interventions can be effectively introduced and new devices designed to avoid common errors.
The literature highlights the fact that the definitions of critical and non-critical errors, as well as the number and type of checklist steps, vary widely between different devices and studies. A critical error is one that may impact the effectiveness of the delivered drug and thereby lead to the sub-optimal disease control of asthma and COPD, 3,7 whereas a non-critical error is one of the checklist steps for a particular device that is not classified as critical.
A previous systematic review (SR) focused on errors with DPI devices only 7 and another has found that there has been no change in the type and number of errors reported over the past 40 years. 8 The present SR and MA was conducted to provide an estimate of error rates (i.e., the proportion of patients with at least one error, critical and/or non-critical) by device type and to evaluate the factors associated with inhaler misuse, for example device and patient characteristics. In addition, the use of educational interventions designed to improve inhaler technique was investigated.

Search results
The search results for the SR are shown in the PRISMA flow diagram (Fig. 1). Overall, a total of 2519 citations were identified via database searching and a further 18 were identified through back-referencing of reviews and other relevant primary studies. After screening, 72 primary studies were extracted, all of which were included in the SR and 40 of these were selected for inclusion in the MA, based on the predefined criteria previously described (Fig. 1). Reasons for study exclusion are provided as Supplemental data (Appendix 1).

Study characteristics
The majority (54%) of the 72 identified studies included in the SR comprised patients with both asthma and COPD, while 32% and 14% were conducted in either asthma or COPD patients, respectively. Most (90%) of the mixed patient studies did not report data for the asthma or COPD subgroups separately.
Fifty-four observational studies were retrieved. Of these, 40 were of a cross-sectional design and the remaining studies were prospective. Only 18 randomised controlled trials (RCTs) were identified, of which 11 were crossover studies and the other seven were of a parallel group design. Approximately 80% of the observational studies included in the MAs were conducted in an outpatient population and with patients primarily utilising their existing inhaler device. Baseline data were obtained prior to device training for most of the RCTs.
The majority of extracted studies evaluated MDI (n = 29) or DPI (n = 32) inhalers and reported either the proportion of patients with any error (critical plus non-critical) or those with any critical error. Amongst the DPI studies, most involved Turbuhaler® (n = 17) or Diskus® (n = 15) devices. Fifty-four studies were conducted in patients who were regular inhaler users and assessed the inhalation technique that patients had been employing on a regular basis.
Due to limited data availability, MAs of any device comparisons were not feasible for the: i) overall error frequency (cross-sectional studies and RCTs) and ii) critical error rates for the RCTs.
Most of the studies included in the overall analysis (n = 72) were conducted in the USA (n = 13), the Netherlands (n = 10) and the UK (n = 10).
Amongst all of the studies, about two-thirds reported that the assessor was trained in the inhalation technique of the device under assessment. The assessors included pharmacists, physicians/general practitioners (GPs)/specialists, students, investigators, research assistants and technicians. In the majority (>90%) of cases, inhalation technique was assessed utilising author-validated, existing checklists. Furthermore, a variety of checklists was used for the same device across different studies. These checklists were highly variable and differed in both the number of steps and their definition(s) of these steps. Errors arose from failure to correctly complete the relevant checklist steps for a particular device.
Supplementary Table 2 details the characteristics of all studies that reported overall and critical error rates. The number of studies included in the MA was lower (<72) because studies that reported, i) an error frequency using a definition other than that previously defined, ii) pooled data, or iii) error rates for each individual step, but not cumulatively, were excluded.

Overall error rate (critical and non-critical)
Across the devices, error rates appeared common with approximately 50-100% of patients experiencing at least one error. The pooled summary results for the MDI devices estimated an overall error frequency (RE model for all studies) of 86.8% [95% CI 79.4-91.9] of patients with at least one error (Fig. 2a)  at least one error, with a high level of between-study heterogeneity (99.0% I-squared statistic; Fig. 3a). The frequency of overall error rates for individual devices is shown in Table 1. None of the studies assessed the overall or critical error rates for the Breezhaler®, Easyhaler®, Ellipta®, Elpenhaler®, Genuair®, Nexthaler®, or Novolizer® devices; devices with zero studies have been excluded from the table. An MA of overall error frequency for Turbuhaler® and Diskus® in prospective and cross-sectional studies is shown as Supplemental data (Appendix 2, Supplementary Figure 1). The most common overall errors by device are detailed in Table 2. A sensitivity analysis of the overall error frequency for MDIs, conducted to assess any bias due to the inclusion of industry-sponsored studies, gave a similar finding (87.6% [95% CI 79.4-92.9]; RE model for all studies) to the pooled summary (Appendix 3; Supplementary Figure 3).

Critical error rates
Across the devices, critical error rates appeared common with approximately 14-92% of patients experiencing at least one critical error. The frequency of critical error rates for individual devices is shown in Table 3. The pooled summary results for MDIs estimated a critical error frequency of 45.6% [95% CI 26.0-66.6] of patients with at least one critical error (n = 10 studies); however, the data were highly heterogeneous (98.4% I-squared statistic) (Fig. 2b). The critical error rates for the DPIs were highly variable for each device: Aerolizer® (n = 4 studies) 14 Fig. 3b). The between-study variability was high (>90% I-squared statistic) for the Diskus® and Turbuhaler® devices. The heterogeneity was lower for the Aerolizer® (44.3% I-squared statistic) and Handihaler® (58.4% I-squared statistic) devices but there were fewer studies available for inclusion, so the between-study variability may be under-estimated.

Impact of patient and study characteristics
The meta-regression analysis showed no significant findings for different baseline characteristics. However, a qualitative review was conducted of the extracted studies that analysed data according to i) patient, ii) disease, and iii) other characteristics, and that assessed the association of these factors with the likelihood of making a device error. A total of 37 primary studies assessed predictors of inhalation technique errors.
Inhaler-related characteristics reported to impact device error rates included prior training on device use, 3,11,26,28,30,31,34,36,40,41 and duration of device use and use of multiple inhalers (n = 3 studies each). 25,29,41 Receipt of prior training on device use was predictive of a lower error frequency compared with no prior training; further, a longer duration of device training compared with shorter duration of training, and receipt of a practical demonstration of correct inhalation technique were factors also associated with lower error frequency rates. A longer duration of device use was predictive of a higher error rate, compared with those who had received their inhaler devices more recently. Use of multiple inhalers also predicted a higher error frequency.
Other characteristics that were reported to affect device error frequency included regular clinic visits, 4, 28 polypharmacy, 18,41 and uncontrolled disease 4, 31 (n = 2 studies each). No trend was observed with regards to patient setting in either overall or critical error frequency.
There was a large variability in the sample size of the included studies. In general, studies with a smaller sample size reported a higher frequency of device errors, particularly critical errors. 26,29,30 There was also a large variation in the number of steps assessed for the different inhaler devices, with studies including between three steps 9 and 12 steps 5 for overall error assessment. The relationship between number of steps and error rate was investigated by ordering forest plots of error rates by the number of steps in each study. There was a trend towards higher error rates with a larger number of steps; however, no consistent pattern could be observed.

DISCUSSION
No SR or MA of this type has been previously published. The SR conducted by Lavorini and colleagues 7 was limited as it focused on DPI devices only and an MA of device errors was not undertaken. The present SR of the existing data provides a valuable and timely assessment of the quality of the existing evidence base surrounding device errors. Despite limitations of the data, it can be seen that both the overall and critical error rates are reported to be high across all devices, ranging from 50-100% and 14-92%, respectively. Although there were very limited data on error rates and symptom control/long-term outcomes, one might hypothesise that correct use of the device is fundamental to the efficacy of the drug, and in this case, the reported error rates and critical error rates may result in sub-optimal treatment and disease control. There were insufficient high-quality data to draw definitive conclusions about the comparative error frequency between different devices. Meta-regression analyses of patient and study characteristics were inconclusive. However, previous studies have reported associations between certain patient and study characteristics and device error rates (although these data were not quantified). Some studies have reported that older age 31,33,35 and female gender 35,44 are associated with higher device error rates. Other socio-economic factors have previously been reported to influence device error frequency. A higher level of education has been reported to be associated with a lower frequency of errors. 25,31,36 Additionally, a higher frequency of errors was found in patients with COPD than those with asthma. 15,28 Other factors reported in the literature to impact error frequency were receipt of prior training and type of training, 36 duration of device use 29 and use of multiple inhalers. 27,30 This may be because patients with a longer duration of device use are likely to have only received training/instruction when the device was first prescribed. The higher error rate reported with multiple inhaler usage may have been due to the higher burden and confusion associated with the use of different devices.

Limitations of the available data
Given the importance of this area of research, the number of publications focusing on device errors is relatively low, especially when compared with the overall volume of clinical research publications in COPD and asthma. Additionally, there was very little information on the association between device error rates and clinical outcomes. From the available publications, the overall quality was moderate for the majority of studies, with few high-quality studies 14,27,30,31,33 (Supplementary Table 2). Whilst it is possible to draw qualitative learning from across the studies, the lack of consistency in studying device errors means that the MAs have to be interpreted with caution.
There were several potential sources of heterogeneity between the different studies including: i) differences in disease diagnosis; ii) heterogeneity due to varying study types; iii) large variability in the level of training received by patients; iv) variability in device step checklists (one of the most important limitations); v) variability in assessors' technique; vi) the subjectivity of each assessment; vii) heterogeneity due to patient-introduced bias (the Hawthorne effect); viii) studies not specifically designed to assess error rates between devices. This inter-study heterogeneity and lack of consensus around error rates and the types of errors associated with different devices may have an impact in the clinic, with healthcare professionals unclear about which inhaler to prescribe for their patients.
In terms of diagnosis, the majority of studies recruited both asthma and COPD patients and did not provide information on the sub-groups according to disease type. 25,27 Moreover, there was no validation of the disease diagnoses reported in the papers. Amongst the cross-sectional studies, patients were generally  14,31 However, in the RCTs, patients were assessed in a controlled manner and the studies normally recruited device-naïve participants only. 45,46 In the cross-sectional studies, device technique was assessed in patients without any study protocol-specific training or instruction prior to study entry. Moreover, the majority of the studies included in the MA specified a "lack of training regarding device use" as the primary reason for device mishandling. 14, 31 Patients who received training had usually received it at the time of first prescription of the device and did not receive any additional follow-up training or further assessment of their inhaler technique. 31 Additionally, the medical personnel responsible for teaching the correct inhaler technique were reported to be lacking in basic training. 6 Inhalation technique was assessed by a variety of assessors including trained pharmacists, 22, 27 respiratory specialists, 40 GPs, 3,14 or others (including trained assistants to the physician, intern students, or laboratory staff). Additionally, not all of the studies provided information regarding the assessors.
There was also variability in the number of steps, the actual details of the device checklist steps and in the definitions of critical and non-critical errors. For example, for the Diskus® device, the number of overall and critical errors possible varied between 5-12 and 2-4, respectively, 29-31, 47, 48 and the corresponding numbers for the Turbuhaler® were 4-14 and 3-5, respectively. 2, 10, 30, 49 There was a trend towards higher error rates with a larger number of steps; however, no consistent pattern was observed. The definition of critical errors used in the majority (approximately 90%) of studies (e.g., "the proportion of patients with an error for a step that is deemed necessary for the adequate delivery of the drug to the lungs") is also highly ambiguous (i.e., adequate could mean anywhere between 20-100% drug delivery).
There may also be heterogeneity due to patient-introduced bias, i.e., the Hawthorne effect. This arises when patients know that their technique is being observed as part of a study and are therefore highly likely to use the inhaler in the best possible way. 37 The observed technique may then not truly reflect daily use, leading to an under-estimation of the error frequency. A few studies specified that inhaler technique demonstration took place in an empty room and that the videotaped observations were assessed by nurses. 50 Finally, although the quality of the data included in these studies was good (Supplementary Table 2), the studies included in these comparisons were largely cross-sectional or prospective in nature and were not designed to compare error rates between different devices. Utilising them in this way necessitates caution. Additionally, a number of the studies had a relatively small sample size; for example, the Batterink study 29 showed significant differences in error rates for comparisons between MDIs and Turbuhaler® and Diskus® devices, but only included ten and five patients, respectively, for the latter two devices.

Limitations of the review methodology
There are a number of limitations associated with this MA. Firstly, studies that reported error frequency using a definition other than that previously defined were not included in the analysis. This excluded approximately 10% of the studies. Secondly, studies that reported error rates for each individual step, but not cumulatively, were described qualitatively only (approximately 10% of studies). Additionally, studies that reported pooled data, i.e., reported for all DPI devices, were not included in the analysis (n = 3 studies).

Main implications for clinical practice and research
Clinical research. There is a need to standardise the definitions of non-critical and critical device errors and their assessment, as well as improve clarity on the clinical importance of each error. Indeed, the literature highlights that definitions of critical and non-critical errors can vary substantially between different studies. It is essential that the wider scientific community reaches a consensus regarding error terminology for the different inhaler devices and develops a standardised checklist for each device, similar to that which is available in the Netherlands. 51 This will not only be useful for standardising the conduct of future clinical research but will also provide a valuable clinical tool and enable comparison of devices across future studies.
It should be noted that a number of errors are common to both the MDI and DPI devices, e.g., "exhaling before inhalation" and "holding breath for a few seconds after inhalation". "Ensuring a proper seal around mouthpiece" is common for MDI with spacer, Diskus® and Turbuhaler® devices (Table 2). It was not possible to identify common errors for Autohaler®, Breezhaler®, Elpenhaler®, Genuair®, Nexthaler® or Novolizer® due to limited data. These steps may provide the basis for the identification and refinement of errors. However, there appears to be more inconsistency in the definition of critical errors.
Once the step checklists and critical/non-critical errors have been standardised, there is a need to conduct more systematic clinical research in this area. There is currently insufficient evidence to be able to determine whether there are any differences in error (non-critical and critical) rates between different inhaler devices. Prospective clinical studies in inhaler-naïve patients, using more objective device training, are required in order to address this issue.
During routine clinical practice, inhaler technique errors may be compensated by increasing the medication dose if disease control is sub-optimal, but this has not yet been systematically studied. Conducting a study to assess low medication doses where the impact of critical device errors is likely to be more pronounced would provide a useful approach to assess the impact of various inhalation steps and errors on patient outcomes. Another approach would be to assess treatments that are available across a number of different devices at microgram-equivalent doses, and to conduct real-world studies in COPD and asthma patients (individually and combined). Finally, as the outcomes data are very limited, there is a need to investigate the links between different critical device errors and long-term, clinical effectiveness (patient outcomes), resource use and adherence rates.
Clinical practice. There has been awareness of the problem of device errors for over 40 years 8 but this issue has still not been resolved. The high error rates observed across inhalation devices in this study suggest that more time should be invested by healthcare professionals in educating/training patients on how to operate their inhalers correctly. There are several factors that need to be addressed, including the requirement for standardised training of healthcare professionals and patients for the different inhaler devices and regular re-evaluation of inhaler technique and mastery. Existing guidelines 3, 4, 49 provide targets for device training but these have still not been achieved. The development of standardised training protocols and schedules for the individual inhaler devices may aid this process. There is also a need to assess the ease of training and continued mastery across the different devices.
Once sound research techniques are available, this could lead to improvements in clinical practice whereby training is standardised and conducted on an ongoing basis, with regular re-assessment of patient device handling.

CONCLUSIONS
This SR and MA highlight the relatively limited body of evidence assessing device errors that is currently available, given the importance of this issue. From the available data in the literature, it is apparent that patients are not operating their inhalers correctly. Overall and critical error rates appear to be high across all of the devices assessed: approximately 50-100% and 14-92%, respectively. However, the high level of heterogeneity between studies prevents any definitive conclusions being drawn. There is currently insufficient evidence to be able to determine whether there are any differences in error (non-critical and critical) rates between different inhaler devices. Furthermore, there are limited available data assessing the impact of device errors on clinical outcomes. There is a need to develop and utilise consistent definitions of non-critical and critical device errors and to develop standardised checklists for each individual device in order to facilitate future clinical research and enable comparability between studies. The development of standardised training protocols for the individual inhaler devices will also aid this process.

Data extraction
All citations (titles and abstracts) were screened for eligibility against the pre-specified inclusion/exclusion criteria. Full publications of the included citations were then reviewed for eligibility. All the citations excluded at the title/abstract or full-text screening stage were coded and the reasons for exclusion recorded.

Systematic review
Studies identified from the full-text screening stage that evaluated the number of patients with at least one overall error (critical plus non-critical) or at least one critical error, or assessed error frequency at each step for a specific device, were included in the SR.

Meta-analyses
Studies reporting at least one overall or critical error were included in the MA. For RCTs and prospective observational studies, the baseline error frequency prior to any study-related training or instruction was included in the analyses. Some of the device data were excluded from the MA because they did not meet criteria for the minimum number of studies and/or patients.

Quality assessment
The quality of all the included studies was assessed. The quality of RCTs was determined using the criteria published by the Cochrane Collaboration, with respect to different types of bias, and the quality of cross-sectional and observational studies was evaluated using the relevant Newcastle-Ottawa scale [2008]. Decisions about the estimated risk of bias were used to help evaluate heterogeneity between the studies.

Statistical analysis
Data from cross-sectional studies and RCTs were analysed separately. For all quantitative analyses, data were evaluated for the proportion of patients with at least one overall error and those with at least one critical error. Studies that reported error frequency using a definition other than that previously defined and studies that reported pooled data, i.e., reported for all DPI devices, were not included in the analysis. Additionally, studies that reported error rates for each individual step, but not cumulatively, were described only qualitatively. Data were excluded from the analysis if there were fewer than five patients using a particular device within a study. MDI devices were categorised into two different subgroups: i) MDI alone, ii) MDI with spacer.
A random-effects model was used for all the MAs in order to account for between-study heterogeneity.
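As a reading aid, the standard random-effects formulation and the I-squared statistic quoted in the results can be written out as below. This is a generic textbook sketch rather than the authors' specification: the choice of a logit-transformed proportion for y_i is an assumption, since the transformation used is not stated in the text.

```latex
% y_i : effect estimate from study i (assumed here: logit of the error proportion)
% v_i : within-study variance of y_i;  k : number of studies
% tau^2 : between-study variance (heterogeneity)
\[
  y_i = \theta + u_i + \varepsilon_i, \qquad
  u_i \sim N(0, \tau^2), \qquad
  \varepsilon_i \sim N(0, v_i)
\]
\[
  \hat{\theta} = \frac{\sum_{i=1}^{k} w_i^{*} y_i}{\sum_{i=1}^{k} w_i^{*}}, \qquad
  w_i^{*} = \frac{1}{v_i + \hat{\tau}^2}
\]
\[
  Q = \sum_{i=1}^{k} w_i \left( y_i - \bar{y}_w \right)^2, \quad
  w_i = \frac{1}{v_i}, \quad
  \bar{y}_w = \frac{\sum_{i} w_i y_i}{\sum_{i} w_i}, \qquad
  I^2 = \max\!\left( 0, \frac{Q - (k - 1)}{Q} \right) \times 100\%
\]
```

An I-squared value above 90%, as reported for most devices, means that almost all of the observed variation in error rates reflects differences between studies rather than sampling error within them.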

Meta-analyses of device error frequency for each device/device type
All studies selected for analysis, and providing data for an individual device, were included in the MAs. The overall device error frequency was summarised for each device/device type using a restricted maximum likelihood (REML) random-effects MA. These analyses were performed when at least two studies provided adequate data for the same device. It should be noted that heterogeneity may be under-estimated when only a small number of studies were available.
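The pooling step described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that proportions are pooled on the logit scale, and it substitutes the closed-form DerSimonian-Laird estimator of the between-study variance for the REML estimator named in the text, because REML requires an iterative fit. The function name `pool_proportions`, the 0.5 continuity correction and the example counts are all hypothetical.

```python
import math


def pool_proportions(studies):
    """Pool (patients_with_error, patients_assessed) pairs for one device.

    Assumptions (not from the source): logit transform, DerSimonian-Laird
    tau^2 in place of REML, 0.5 continuity correction for 0% or 100% rates.
    """
    if len(studies) < 2:
        raise ValueError("at least two studies are needed for a pooled estimate")

    logits, variances = [], []
    for events, n in studies:
        if events == 0 or events == n:
            events, n = events + 0.5, n + 1.0
        non_events = n - events
        logits.append(math.log(events / non_events))
        variances.append(1.0 / events + 1.0 / non_events)

    # Fixed-effect weights give Cochran's Q, the basis for tau^2 and I^2.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, logits))
    df = len(studies) - 1
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "pooled": expit(mean),
        "ci_low": expit(mean - 1.96 * se),
        "ci_high": expit(mean + 1.96 * se),
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical counts, for illustration only (not data from the review).
example = pool_proportions([(80, 100), (45, 50), (150, 200)])
```

Back-transforming the interval limits from the logit scale keeps the confidence interval inside 0-100%, which matters when pooled error rates are close to 100%.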

Meta-analyses of device error frequency by sub-groups
The MAs described above were repeated by sub-group, i.e., i) diagnosis of asthma or COPD (asthma-only study, COPD-only study, mixed asthma/COPD study) and ii) previous device use (device-naïve patients, experienced device users, and a mix of naïve and experienced users).

Meta-analyses for comparison of pairs of devices/pairs of device types
Data from all the studies that provided error frequencies for the same two devices (within the same study) were included in the MAs. Pairs of devices were compared using a REML random-effects MA provided that a minimum of five studies for each comparison was available.
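A within-study comparison of two devices could be pooled along the following lines. This is a hedged sketch under stated assumptions: the text does not name the effect measure, so the odds ratio is assumed; the two device groups within a study are treated as independent, which would not hold for crossover designs; and the DerSimonian-Laird estimator again stands in for REML. The name `compare_devices` and the example counts are hypothetical. The `min_studies=5` default mirrors the minimum of five studies stated above.

```python
import math


def compare_devices(paired_studies, min_studies=5):
    """Pool within-study odds ratios of error for device A versus device B.

    Each item is (errors_a, n_a, errors_b, n_b) taken from a single study.
    Assumptions (not from the source): odds ratio as effect measure,
    independent groups, DerSimonian-Laird tau^2, and 0.5 added to every
    cell of a study that has an empty cell.
    """
    if len(paired_studies) < min_studies:
        raise ValueError("too few studies for this comparison")

    log_ors, variances = [], []
    for errors_a, n_a, errors_b, n_b in paired_studies:
        cells = [errors_a, n_a - errors_a, errors_b, n_b - errors_b]
        if min(cells) == 0:
            cells = [cell + 0.5 for cell in cells]
        a, b, c, d = cells
        log_ors.append(math.log((a * d) / (b * c)))
        variances.append(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)

    # Between-study variance from Cochran's Q (method of moments).
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, log_ors))
    scale = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(log_ors) - 1)) / scale)

    w_re = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "odds_ratio": math.exp(mean),
        "ci_low": math.exp(mean - 1.96 * se),
        "ci_high": math.exp(mean + 1.96 * se),
    }


# Hypothetical counts: device A has the higher error rate in every study.
higher = compare_devices([
    (40, 50, 20, 50), (35, 50, 25, 50), (45, 60, 30, 60),
    (30, 40, 20, 40), (50, 70, 35, 70),
])
```

An odds ratio above 1 indicates more errors with device A; a confidence interval that spans 1 would be read as no demonstrated difference.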

Meta-regression of device error frequency for each device/device type
Meta-regression is a tool for exploring the association of patient characteristics with outcomes of interest, thereby investigating sources of heterogeneity. Meta-regression aims to discern whether a relationship exists between an outcome measure and explanatory variables.
In order to assess the impact of baseline characteristics on the device error frequency, baseline characteristics reported in the studies were incorporated as regression factors into the REML random-effects MA model. Where available, the following baseline characteristics were considered for inclusion in the meta-regression analysis: i) population mean age; ii) proportion of current smokers; iii) proportion of males. However, as no significant findings were observed, these data are not reported.
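The idea of adding a baseline characteristic as a regression factor can be illustrated with a single-covariate weighted least-squares fit. This is a simplified sketch, not the model fitted in the review: the between-study variance `tau2` is supplied by the caller rather than estimated by REML, only one covariate is handled, and the name `meta_regress` and all example values are hypothetical.

```python
import math


def meta_regress(effects, variances, covariate, tau2=0.0):
    """Fit effect_i = intercept + slope * covariate_i by weighted least squares.

    Weights are 1 / (v_i + tau2). Assumption (not from the source): tau2 is
    passed in, e.g. from a separate heterogeneity estimate, instead of being
    estimated jointly by REML.
    """
    if len(effects) < 3:
        raise ValueError("at least three studies are needed")
    if max(covariate) == min(covariate):
        raise ValueError("covariate does not vary between studies")

    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, covariate))
    sy = sum(wi * y for wi, y in zip(w, effects))
    sxx = sum(wi * x * x for wi, x in zip(w, covariate))
    sxy = sum(wi * x * y for wi, x, y in zip(w, covariate, effects))

    denom = sw * sxx - sx * sx
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    se_slope = math.sqrt(sw / denom)

    # Two-sided p-value from a normal approximation for the slope.
    z = slope / se_slope
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {"intercept": intercept, "slope": slope,
            "se_slope": se_slope, "p_value": p_value}


# Hypothetical inputs: logit error rates against population mean age (years).
mean_age = [50.0, 60.0, 70.0, 65.0]
logit_error = [0.1 * age - 5.0 for age in mean_age]
fit = meta_regress(logit_error, [0.04, 0.09, 0.05, 0.02], mean_age, tau2=0.1)
```

A slope whose confidence interval includes zero corresponds to the "no significant findings" outcome reported for age, smoking status and sex.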

Sensitivity analysis
A sensitivity analysis was also conducted to assess any potential bias resulting from the inclusion of industry-sponsored studies. In this analysis, data for inhaler devices that were products of the pharmaceutical company sponsoring the clinical trial(s) were excluded from the meta-analyses. The results of the sensitivity analysis were compared to the original analysis where all relevant studies were included.
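The exclusion rule for the sensitivity analysis can be sketched as a filter followed by re-pooling. This is an illustrative outline only: the record fields (`manufacturer`, `sponsor`, `events`, `n`), the function names and the example data are hypothetical, and the compact pooling helper uses the DerSimonian-Laird estimator on the logit scale as a stand-in for the REML model described in the text.

```python
import math


def dl_pool_logit(pairs):
    """DerSimonian-Laird pooled proportion; assumes 0 < events < n in each pair."""
    ys = [math.log(e / (n - e)) for e, n in pairs]
    vs = [1.0 / e + 1.0 / (n - e) for e, n in pairs]
    w = [1.0 / v for v in vs]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, ys))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (len(ys) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]
    mu = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return 1.0 / (1.0 + math.exp(-mu))


def sensitivity_analysis(records):
    """Compare the pooled error rate with and without sponsor-owned device data."""
    all_pairs = [(r["events"], r["n"]) for r in records]
    # Drop a record when the device under test belongs to the study sponsor.
    kept = [(r["events"], r["n"]) for r in records
            if r["sponsor"] is None or r["sponsor"] != r["manufacturer"]]
    return {
        "all_studies": dl_pool_logit(all_pairs),
        "excluding_sponsor_devices": dl_pool_logit(kept),
        "n_excluded": len(all_pairs) - len(kept),
    }


# Hypothetical records, for illustration only.
records = [
    {"manufacturer": "CompanyA", "sponsor": None, "events": 40, "n": 50},
    {"manufacturer": "CompanyA", "sponsor": "CompanyB", "events": 80, "n": 100},
    {"manufacturer": "CompanyA", "sponsor": None, "events": 160, "n": 200},
    {"manufacturer": "CompanyA", "sponsor": "CompanyA", "events": 20, "n": 100},
]
result = sensitivity_analysis(records)
```

If the two pooled estimates are close, as reported for the MDI analysis, the sponsor-owned device data are unlikely to be driving the overall result.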

ACKNOWLEDGEMENTS
Editorial support in the form of development of the draft outline and manuscript first draft in consultation with the authors, editorial suggestions to draft versions of this paper, assembling tables and figures, collating author comments, copy editing, fact checking, referencing, and graphic services was provided by Julie Wilson of Bridge Medical, Richmond, UK, and was funded by GSK. Additional editing and formatting prior to submission was carried out by Angela Rogers and Emma Landers of Gardiner-Caldwell Communications, Macclesfield, UK and was funded by GSK. This systematic review and meta-analysis was funded by GSK. Medical writing support for this manuscript was also funded by GSK.

AUTHOR CONTRIBUTIONS
H.C. made substantial contributions to the design and implementation of the study and analysis of the results, and is accountable for all aspects of the work. He also prepared and reviewed the manuscript, as well as approving and submitting the final version for publication. J.v.d.P. made substantial contributions to the design of the study and is responsible for the accuracy and integrity of the study. He was also closely involved with the development of the manuscript and approved the final version for publication. R.S. reviewed the publications resulting from the searches and identified those to include. He also made substantial contributions to the development of the manuscript, as well as approving the final version for publication. N.B. made substantial contributions to the study design and analysis of the results. He was also closely involved with the development of the manuscript and approved the final version for publication. B.D. planned and carried out the statistical analyses. He also assisted with the development of the manuscript, as well as approving the final version for publication. A.M. made substantial contributions to gathering data and interpretation of results, including the accuracy and integrity of the findings. He was closely involved with the development of the manuscript, including approval of the final version for publication. M.T. made substantial contributions to the concept and design of the study, managed the study site in Southampton (UK), analysed data and interpreted the results. He also prepared and revised the manuscript, as well as approving the final version for publication.