Measuring free-living physical activity in COPD patients: Deriving methodology standards for clinical trials through a review of research studies

Article history: Received 12 November 2015. Received in revised form 10 January 2016. Accepted 14 January 2016. Available online 19 January 2016.
This article presents a review of the research literature to identify the methodology used and outcome measures derived in the use of accelerometers to measure free-living activity in patients with COPD. Using this and existing empirical validity evidence we further identify standards for use, and recommended clinical outcome measures from continuous accelerometer data to describe pertinent measures of sedentary behaviour and physical activity in this and similar patient populations. We provide measures of the strength of evidence to support our recommendations and identify areas requiring continued research. Our findings support the use of accelerometry in clinical trials to understand and measure treatment-related changes in free-living physical activity and sedentary behaviour in patient populations with limited activity.
© 2016 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Chronic obstructive pulmonary disease (COPD) is a chronic, poorly reversible respiratory disease characterised by airflow limitation and deterioration of lung function. According to the World Health Organization, COPD affects around 64 million people globally and killed more than 3 million people in 2004 [1]. The total annual cost of COPD to the UK National Health Service is estimated to be over £800 million [2]. Respiratory symptoms of COPD include dyspnea, cough and production of sputum. Difficulty breathing can reduce the ability of COPD patients to exercise and to perform routine activities such as standing up and walking.
Physical activity reduces the risk of many chronic diseases through a number of mechanisms including, for example, improved weight control, enhanced lipid profiles, improved glycaemic control and lower blood pressure [3]. Psychological effects of physical activity include reductions in stress, depression and anxiety [3]. Increased physical activity is also associated with improvements in quality of life [4]. In COPD, inactive patients are reported to exhibit worse exercise capacity, more dyspnea and a trend for worse functional status which can lead patients into a vicious cycle of increased dyspnea, exacerbations, deconditioning, declining lung function and mortality [5,6]. Increasing physical activity in COPD is associated with improved health outcomes including reductions in hospital admissions and respiratory mortality [7].
In clinical trials of COPD treatment, improvements in physical activity and mobility are important secondary outcomes and these are routinely estimated using in-clinic controlled assessments of exercise capacity such as treadmill tests and the six minute walking test (6MWT). In the 6MWT, the distance that a patient can walk in 6 min is recorded, using either a treadmill or an empty corridor circuit. This provides a controlled assessment of functional capacity [8]. For many reasons, 6MWT distance may not always relate to the amount of physical activity and the degree of mobility that the patient actually achieves in their daily living. For drug treatments that improve lung function in COPD patients, increased activity and mobility and the associated improvements in quality of life may be expected. This increased willingness and motivation to conduct discretionary non-essential tasks requiring levels of physical activity may be better measured in the home context using an activity monitor.
Despite this, there has been limited usage of activity monitors to measure physical activity endpoints in clinical drug development programmes for COPD and related indications. This is likely due to a number of perceived barriers, including: (i) regulatory acceptance of the validity of devices and the data management assumptions made; (ii) scientific understanding of the data recorded and how to derive meaningful summary outcome measures; and (iii) a lack of standards for implementing activity data collection in clinical trial protocols.
Research grade activity monitors retail from $200 to $300 per unit, prices that are not prohibitive within the budget of many pharmaceutical clinical trials. Overall numbers required may be reduced by recycling and re-using devices within an individual study. The logistics associated with device provision and management are a source of additional resource requirements and expenditure when employing activity monitors in a global, multicentre clinical trial.

Regulatory acceptance
Miniaturisation of sensors and circuitry has enabled huge proliferation in the development and commercialisation of wearable and external monitoring devices in the areas of wellness and health. Examples include cardiac and ECG monitoring devices and sleep and activity monitors. Activity monitors and their associated apps and software are growing in popularity for those wanting to improve fitness or manage weight through regular exercise regimens. High accuracy and precision of these devices is less important in the personal health monitoring arena, yet vital in the area of clinical research if the data from these devices are to be used in the accurate characterisation of treatment related effects [9]. Not all commercial devices may have the degree of accuracy and precision that would warrant their use in clinical trials, but some manufacturers have invested significantly into the scientific validation of their devices and associated algorithms, and validation evidence has been published in peer reviewed journals. This provides good evidence of device validity, accuracy and precision that can support the use of such devices in clinical trials. This may be associated with European CE marking approval and/or US 510(k) approval as additional device quality credentials.
In addition to the ability of the device to accurately measure activity and mobility in terms of continuously recorded accelerometer data, regulators will be concerned with how the recorded data were cleaned and summarised, what assumptions were made in doing so, and the relevance of the outcome measures derived.
A growing number of commercial devices provide the raw accelerometry data, but all summarise raw signals into counts and/or estimates of energy expenditure such as METs and kcals. Firmware on the device is responsible for translating raw accelerations into these summary measures, and activity intensity thresholds (e.g. moderate, vigorous) and energy expenditure are then determined with reference to published doubly labelled water calibration curves or VO2/HR regression lines. Usually the specific details of the preliminary data processing algorithms are contained within the device, are proprietary to the device manufacturer and are not disclosed. This firmware is also responsible for filtering noise out of the signal, for example routine vibrations picked up as small accelerations during motor vehicle travel. The extent of published validation work should provide a measure of confidence in the scientific validity of this firmware, although a growing number of accelerometers are beginning to provide access to the raw data, enabling researchers to apply standard open algorithms to interpret the accelerometry signals.
Aside from this, regulators will have an interest in how valid data are identified amongst the continuous stream of data recorded. This will include, for example, reviewing the assumptions that were made to determine whether a period where no activity was recorded was due to lack of movement or due to removal of the device, and how missing data were dealt with, arising, for example, from periods of non-wear during a day, or missing days of data.

Scientific understanding of the data
Activity monitors provide a variety of variables associated with activity. For pedometers, the most basic measure is the number of steps over a period of time. The equivalent for accelerometers is the number of counts, which measure not just the presence of a movement but also the magnitude of force (acceleration) generated by movement. As indicated above, activity monitors also often use these to estimate energy expenditure in the form of kcals and METs.
There are many different ways in which continuous activity data can be summarised to create relevant summary statistics. As opposed to total counts/steps or total energy expenditure, increased mobility in some patient groups may be better shown in the duration of bouts of gentle activity, such as walking. The total counts per day may not be sensitive to detecting these sorts of subtle changes, but measures such as the number of times a patient walked, the duration of walking achieved, and the number of walking episodes exceeding a defined duration (e.g. 60 s) may be more sensitive to detecting change and improvement [9]. The objective here is perhaps to identify discretionary activity as opposed to more involuntary functional activity, such as going to the bathroom.
In healthier subjects, total counts per day may be a useful measure, but there are also other valuable summary measures that may be sensitive indicators of change in activity, in particular the time spent in different levels of exertion, such as moderate physical activity and vigorous physical activity. Cut-points can be defined that relate to different intensities of activity. Using these, time spent at different exercise intensities, such as light, moderate, vigorous and very vigorous activity, can be calculated. Again, in certain situations, these may provide a more sensitive measure of increased fitness or activity than a measure of total activity or energy expenditure over the day.
In addition, the overall effect of sedentary behaviour on health outcomes is thought to be independent of the degree of moderate and vigorous activity achieved during active periods [10], and therefore characterisation of sedentary behaviour may also be an important health outcome measure.
For application in clinical trials it will be important to carefully consider and define a priori the summary outcome measures that should be calculated. These should be selected based on knowledge of the patient population and how improvements and changes in mobility and activity might be observed. Those designing clinical development programmes will benefit from published studies identifying outcome measures that are sensitive to detecting treatment-related improvements in the patient population of interest.

Standards for implementation
As described previously, a possible barrier to large scale use of activity monitors in clinical trials is a lack of consensus in implementation standards. Standards, for example, apply to the type of device used, the location of the device on the patient's body and the number of days of measurement required for reliable estimation of overall activity.
For example, when it comes to sensor location, common locations include wrist, arm, waist, thigh and foot. Some locations may be chosen with consideration of patient convenience and maximising wear time (e.g. wrist), whereas other locations may be selected for more scientific reasons. For example, thigh placement may provide more robust information on whether the patient is standing or seated, whereas the waist may be less well suited to identify postural changes but may enable optimal estimation of energy expenditure associated with walking activity [11].
This article reviews the research literature to understand how activity monitors have been used in COPD research studies to date. The objective was to identify learnings that may help to guide standards for their use in future research and to help to limit the perceived barriers in large scale implementation in regulatory clinical development programmes.

Methods
Eighty-nine research studies published between 1999 and 2014 were identified through a PubMed search in January 2015 for published articles containing both "COPD" and "Activity" in the article title and/or abstract, together with at least one of the following search terms: "Pedometer", "Accelerometer" and "Accelerometry". The list was subsequently reduced to 76 studies after review articles (n = 2), clinic-only investigations (n = 5), algorithm development and validation studies (n = 1), studies not using an activity monitor (n = 1), studies not investigating COPD subjects (n = 1), and foreign language articles without an English translation available (n = 3) were removed. Of the 76 studies, our review contained 42 cross-sectional studies, 18 intervention studies (either using an activity monitor as a component of an intervention or to measure the impact of an intervention), 3 longitudinal studies and 13 measurement studies (e.g. assessing the measurement qualities of the activity measure).
The methodology used in each of the 76 studies was summarised (Table 1) and included in the review to understand methods used and outcomes measured when utilising an activity monitor with COPD patients. Where the article did not provide information on a specific aspect, this was recorded as "not reported".
The full metadata description of all studies included in this review is available on request from the first author.

Activity monitor: model
The most commonly used activity monitors reported by the sample of COPD research studies were as follows (Fig. 1a):
• The SenseWear Armband device (BodyMedia Inc., Pittsburgh, PA), 11 out of 76 studies: a triaxial accelerometer worn on the arm using a removable armband.
• The DynaPort Activity Monitor (McRoberts BV, Netherlands), 11 out of 76 studies: a triaxial accelerometer usually worn on a belt around the waist at the lower back.
• The ActiGraph 7164, GT1M and GT3X+ devices (ActiGraph, Pensacola, FL), 7 out of 76 studies. The 7164 is a uniaxial accelerometer, the GT1M is a biaxial accelerometer, and the GT3X+ is a triaxial accelerometer, all usually worn on the waist, although the GT3X+ is also commonly worn on the wrist.
The "Other" category contained 22 different activity monitor models, each reported once or twice across the sample of articles. One study did not report the model of activity monitor used. At the time of this review, a number of the activity monitor models reported have been superseded by new versions. In many of these cases the new models contain fundamentally the same components and firmware, although some may be associated with performance differences.

Activity monitor: placement location
Over 40% of studies (31/76) used an activity monitor attached to a belt around the waist and located at the waist or hip (Fig. 1b).
Some authors determined that monitors should be located on a specific side of the body, sometimes specifying the non-dominant side although the rationale for the choice of side was not given. On some occasions it was reported that patients were instructed to wear the device in line with the thigh midline. Eleven studies (14%) provided activity monitors that were attached to the arm using an armband. Again, some authors determined that a specific arm was to be used. Ankle attachment was reported by 8% (6/76) and wrist by 4% (3/76) of studies. Multiple locations were used in seven studies where more than one sensor was used, for example hip and arm locations or waist and thigh locations. Single studies also reported housing activity monitors in the trouser pocket of the patient or on the patient's shoe, and two studies at the lower back.

Data collection: period of wear
Huge variation in the number of days subjects were instructed to wear an activity monitor was observed across the sample of studies (Fig. 1c). However, there was little difference in prescribed wear time based on the study type: the median wear time was 7 days for cross-sectional, intervention and measurement studies. Of the six studies requiring activity monitor wear for greater than 30 days, two were longitudinal studies examining changes in physical activity around the time of exacerbation [31,78], and for this reason required longer wear periods of 16 weeks and 6 months respectively. Two were intervention studies where the activity monitor was a component of the intervention [21,55] with 90 day and 14 week wear periods respectively. One study [75] used activity monitors to measure the impact of an exercise counselling programme over a 12 week period, and a further study [43] was a clinical trial of the β2-agonist indacaterol, requiring activity monitor wear for an 8 week period.
In many studies, subjects wore the device for a defined period before or at the start of an intervention or observation period, and then repeated this later to enable within-subject estimation of changes in activity. The length of period of wear varied from a 2 day period (8 of 76 studies) to 6 months continuous wear (reported by a single study). The mean wear period was 15.6 days (sd = 29.0 days), with a median wear period of 7 days (interquartile range: 5 to 9 days). Most commonly, 7 days was selected as the required wear interval (36%, 27 of 76 studies), with 3 days (7 out of 76 studies) and 2 weeks (7 out of 76 studies) the next most commonly reported intervals. There were no reported compliance figures accompanying reports and so it was not possible to understand the impact of wear interval on subjects' willingness and ability to continue wearing the device, although it is assumed that this must have been a challenge for the studies collecting data for multiple weeks. A small number of studies measuring less than seven days of activity specified whether the assessment interval was required to contain one or more weekend days. One study of seven days assessment required the device to be worn overnight on one occasion during the assessment week [80].

Data collection: valid day definition
Almost half of the studies did not report how a valid day of device wear was determined (36 of 76 studies, Fig. 1d). It is not clear in these cases whether some data were excluded or all data included in the reported analysis, whether data were standardised across patients to allow for missing intervals, or whether some assumptions and imputations were made to allow for missing data. Of studies that did report definitions of a valid wear day, the majority identified a minimum wear time of 10 or 12 hours to constitute a valid day (12 and 11 studies respectively). The mean valid day definition was 12.8 h (sd = 5.6 h), with a median of 10 h (interquartile range: 9.9 to 12 h). Two studies included in the valid day category of 10 h or more defined the valid day as 9.8 h or more (70% of the interval 08:00 to 22:00 h). Seven studies (9%) reported 8 h to be the minimum required, and 5 studies (6.5%) expected a full 24 hour period to be collected for inclusion in their analysis. One study reported that all data were included independent of wear time achieved [70].

Data management: number of valid days required
Although patients were requested to wear an activity monitor for a defined wear period, some authors felt that if a minimum number of days of wear was achieved within this interval, then this would provide an adequate estimation of the general pattern of activity of the patient. Over two thirds of authors did not report the minimum number of days required (53 of 76 studies, 70%, Fig. 1e). Of those that did report the minimum number of valid days needed, this ranged from a minimum of 2 to 7 valid days, with some authors requiring the valid interval to include one or both weekend days. One study [63] routinely excluded the first and last days of its 7 day measurement interval because the device was provided and returned on these days, resulting in incomplete measurement. A further study of 10 days of activity monitor use examined only the last 7 day period to eliminate reactivity (Hawthorne effect) and to ensure every day of the week was included [73].
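The valid-day and minimum-days rules described in the two sections above can be sketched as a simple filter. This is a minimal illustration only: the 10 h default reflects the median valid day definition reported in this review, the default of 4 valid days is an assumed value chosen from within the reported range of 2 to 7, and the function names and the dictionary layout are hypothetical.

```python
def valid_days(wear_minutes_by_day, min_wear_hours=10.0):
    """Return the days whose recorded wear time meets the minimum."""
    return [day for day, minutes in wear_minutes_by_day.items()
            if minutes >= min_wear_hours * 60]


def has_enough_days(wear_minutes_by_day, min_wear_hours=10.0,
                    min_valid_days=4):
    """True if the patient has at least `min_valid_days` valid days.

    min_valid_days=4 is an illustrative assumption, not a recommendation
    taken from the reviewed studies.
    """
    n_valid = len(valid_days(wear_minutes_by_day, min_wear_hours))
    return n_valid >= min_valid_days


# Hypothetical wear time (minutes) for one patient across a 7 day period.
wear = {"Mon": 720, "Tue": 540, "Wed": 610, "Thu": 600,
        "Fri": 300, "Sat": 845, "Sun": 0}
print(valid_days(wear))
print(has_enough_days(wear))
```

A rule requiring one or both weekend days, as some authors specified, could be added as a further check on the list returned by `valid_days`.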

Data management: definition of non-wear episode
The majority of articles did not report how non-wear time was identified (67 of 76 studies, 88%, Fig. 1f). Four studies identified a 60 minute period of zero counts as evidence of the device not being worn. One of these studies allowed up to two one-minute epochs to register counts less than 100 amongst the 60 min zero count episode to also qualify as non-wear time, likely to allow for the device being moved or the presence of spontaneous electrical spikes whilst not worn. One study used a 20 min or more period of zero counts as non-wear time [60]. Two studies used more complex definitions of non-wear. In one study [71], the RT3 Tracker (StayHealthy Inc., Monrovia, CA) was assumed to be worn at a specific point in time if at least two of the following conditions based on Vector Magnitude Units (VMU, a measure of acceleration in the direction of travel, calculated as the square root of the sum of the squared accelerations in the three orthogonal directions measured) were satisfied: 1. [...] 2. Of the following 20 min, do at least two have VMU·min⁻¹ values > 5? 3. Of the preceding 20 min, do at least two have VMU·min⁻¹ values > 5?
A further study using the StepWatch Pedometer (Orthocare Innovations LLC, Seattle, WA) considered non-wear days as opposed to intervals within a day, defining a non-wear day as a day with fewer than 200 steps recorded over a minimum of 8 h [40]. Two studies [34,77], using the DynaPort (McRoberts BV, Netherlands) and SenseWear armbands (BodyMedia Inc., Pittsburgh, PA) respectively, reported non-wear as estimated automatically by the device firmware, but did not provide the definition used. A further study used patient diary records to indicate periods of non-wear [67].
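The 60 minute zero-count rule described above, including the tolerance of up to two one-minute epochs below 100 counts, can be sketched as follows. This is one possible reading of the rule rather than the code used by any reviewed study or device firmware: it assumes one-minute epochs held in a list, and the function name `find_nonwear` and its parameters are hypothetical.

```python
def find_nonwear(counts, min_len=60, max_spikes=2, spike_limit=100):
    """Return (start, end) index pairs (end exclusive) of non-wear episodes.

    An episode starts on a zero-count epoch and may contain at most
    `max_spikes` non-zero epochs, each below `spike_limit` counts.
    """
    episodes = []
    i, n = 0, len(counts)
    while i < n:
        if counts[i] != 0:
            i += 1
            continue
        # Grow a candidate episode from this zero-count epoch.
        j, spikes, last_zero = i, 0, i
        while j < n:
            c = counts[j]
            if c == 0:
                last_zero = j
            elif c < spike_limit and spikes < max_spikes:
                spikes += 1          # tolerated low-count epoch
            else:
                break                # real activity, or too many spikes
            j += 1
        end = last_zero + 1          # trim any trailing tolerated epochs
        if end - i >= min_len:
            episodes.append((i, end))
            i = end
        else:
            i += 1                   # retry from the next epoch
    return episodes


# 10 min of activity, 71 min of non-wear with one small spike, then activity.
counts = [50] * 10 + [0] * 30 + [20] + [0] * 40 + [500] * 5
print(find_nonwear(counts))
```

Setting `min_len=20` and `max_spikes=0` gives the stricter 20 min zero-count definition that was also reported.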

Data management: missing data methodology
No studies reported methodologies for dealing with missing data aside from the definitions of valid days and valid intervals as described above. In all cases, it seems that patients with insufficient data, or days with insufficient recording intervals, were excluded from the analyses. By eliminating data in this way, unbiased estimates of overall activity could still be obtained if the data were truly missing at random. This may be the case if non-wear (for example) is due solely to forgetfulness as opposed to other reasons such as participating in water-based activities or periods of poor health, but without additional contextual information this is not possible to determine.

Data analysis: summary endpoints analysed
The types of summary endpoints reported could be categorised into seven broad measurement categories (Table 2), which are discussed below.
Before summarising the activity data recorded, one study adjusted for time spent travelling in a motor vehicle by replacing the data recorded during self-reported travel times, taken from a patient diary, with average VMUs recorded whilst sitting [86].

Total activity
Over two thirds of the studies reviewed (69.7%) reported one or more measures of overall daily activity. In all studies, total activity was expressed as the daily number of steps or accelerometer counts or VMUs. Accelerometer counts are a measure of the frequency and intensity of accelerations and decelerations in the vertical axis. Counts are units whose values are specific to each brand of monitor. One study reported total counts recorded across a two day period. Some studies also reported overall activity in MET hours (5 of 76 studies). METs are "metabolic equivalents", defined as the energy cost of physical activities as a multiple of the resting metabolic rate, which is represented by a value of 1.0. One of these studies compared the time spent at a number of different ranges of MET values between treatment groups [43]. Approximately 12% of studies reported estimates of energy expenditure when summarising overall activity. The majority of these (7 of 9 studies) reported total daily energy expenditure in kcal. Single studies reported the total daily intensity of activity (MET·h) in bouts of activity above a defined MET threshold, either 1.4 METs [58] or 3.0 METs [76]. A further study reported the total distance walked per day [45].

Time in bouts of activity
Approximately 40% of studies (30 of 76 studies) reported a measure of time spent in defined intensities of physical activity. Activity level thresholds were based on steps, METs or counts per minute, VMUs or (in fewer studies) walking speed. Some cut-off points for activity levels were in line with the scientific literature. For example, cut-off points based on counts reported by validation studies performed by Freedson reported sedentary, light and "lifestyle" activity as < 100, 100-759 and 760-1951 counts per minute respectively [88], based on device-specific counts thresholds from the vertical axis of the ActiGraph (ActiGraph, Pensacola, FL) device. Most frequently, activity levels were defined based upon METs (around 13% of studies). Because METs indicate the increase in energy expenditure during an activity relative to the patient's resting energy expenditure, some authors consider them to be the most suitable energy expenditure unit to be used [89]. General intensity thresholds based on METs have been reported as 3-6 METs for moderate activity, 6-9 METs for vigorous, and > 9 METs for very vigorous [90]. Tudor-Locke and Rowe [91] identify METs as the measure for which there is greatest consensus in the literature for use in determining activity intensity thresholds. Importantly, they also identify the relationship between walking speed, or cadence, and METs to be relatively consistent between individuals of varying fitness levels. While these thresholds were used by some studies to define the time spent in different levels of physical activity, other studies used different cut-off points and did not report a rationale for their selection. In one study [44] the cut-off points used were based on the mean maximal oxygen uptake during a separate cardiopulmonary incremental exercise test in the same group of patients. The authors felt this approach was advantageous in comparison to previous research in COPD where standard cut-off points for the general population were used, as they suggest that this results in higher relative intensities when applied to older, less fit individuals.
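To illustrate how cut-off points translate per-minute data into time spent at each intensity, the sketch below applies the count and MET thresholds quoted above. It is a minimal example under stated assumptions: values that fall exactly on a boundary are assigned to the higher band, the labels "above lifestyle" and "below moderate" are placeholders for ranges the text does not name, and all function and variable names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Lower bounds of each successive band, in counts per minute
# (sedentary < 100, light 100-759, lifestyle 760-1951).
COUNT_BOUNDS = [100, 760, 1952]
COUNT_LABELS = ["sedentary", "light", "lifestyle", "above lifestyle"]

# MET bands: 3-6 moderate, 6-9 vigorous, above 9 very vigorous.
# Assumption: a value exactly on a boundary goes to the higher band.
MET_BOUNDS = [3.0, 6.0, 9.0]
MET_LABELS = ["below moderate", "moderate", "vigorous", "very vigorous"]


def classify(value, bounds, labels):
    """Return the intensity label for one per-minute value."""
    return labels[bisect_right(bounds, value)]


def minutes_per_band(per_minute_values, bounds, labels):
    """Count the one-minute epochs falling in each intensity band."""
    tally = Counter(classify(v, bounds, labels) for v in per_minute_values)
    return {label: tally.get(label, 0) for label in labels}


epochs = [0, 99, 100, 759, 760, 1951, 2500]
print(minutes_per_band(epochs, COUNT_BOUNDS, COUNT_LABELS))
print(classify(4.2, MET_BOUNDS, MET_LABELS))
```

Population-specific cut-off points, such as those derived from maximal oxygen uptake in one of the reviewed studies, could be substituted by passing a different `bounds` list.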

Body positions and modes of activity
The time spent in different body positions and associated activity was reported in 19.7% of studies reviewed. Most commonly examined was time spent walking, standing, sitting and lying (12, 6, 9, and 7 studies respectively). Single studies also reported the percentage of the day spent lying or sitting [17], and time spent moving but not walking [45]. One study also reported the maximum time spent lying, sitting, standing and walking in any single period/bout in that state [32]. While reported by the firmware within many accelerometers, it is unclear how accurately sitting, standing and lying states can be identified, in particular by those worn on the waist, wrist, arm or ankle, as was the case in the majority of studies in this review. In these positions, the spatial positioning of the accelerometer may not change significantly between sitting and standing states. In addition to the cut-off points described above, time spent sitting or lying is a further measure of sedentary behaviour.

Postural transitions
One study using an accelerometer with sensors on the thigh and chest explored the number of times COPD patients changed posture, either standing up from seated or standing from lying down [32]. While multi-sensor devices facilitate this kind of investigation, most researchers use commercially available instruments which generally comprise single sensors. Currently, the standard device for measuring postural transitions, the ActivPAL (PAL Technologies, Glasgow, UK), distinguishes standing from sitting or lying, and work is in progress to differentiate between sitting and lying (Douglas Maxwell, PAL Technologies, personal communication). Postural change is an interesting outcome measure that may be sensitive to detecting changes in discretionary non-routine daily activity in activity-compromised patients, and one that is seeing increased use in public health research. For example, Carlson et al. [92] identified that increasing postural transitions by including an additional 10 breaks in sedentary bouts per day resulted in a systematic improvement in waist circumference, blood pressure, HDL, triglycerides, fasting glucose and insulin in adults.

Time inactive/active
Seven studies (9.2%) looked at the time spent active over the day, and two studies reported the time spent inactive (based on intervals of zero steps). The studies reporting time inactive based on intervals of zero activity did not provide a definition of non-wear time and it is unclear whether inactive time may be overestimated by including periods of non-wear.

Intensity of activity
Twenty studies (26.3%) reported some form of outcome measuring the intensity of activity achieved. Almost half of these reported the daily physical activity level (PAL), which was calculated as the total daily energy expenditure divided by the resting energy expenditure (9 studies). In some cases, resting energy expenditure was estimated by sleeping energy expenditure as patients were instructed to wear the activity monitor for the full 24 hour period (e.g. [14]), or by estimation using gender-specific Harris and Benedict prediction equations [93]. One study examined whether a target PAL of greater than 1.7 was achieved [77]. The International Association for the Study of Obesity recommends a PAL score of 1.7 or above to represent an amount of daily activity sufficient to prevent the transition to overweight or obesity [94].
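The PAL calculation described above is a simple ratio and can be sketched as below. The function names and example energy values are hypothetical, and treating a PAL of exactly 1.7 as meeting the target follows the "1.7 or above" wording of the recommendation; it is an assumption where a study states "greater than 1.7".

```python
def physical_activity_level(total_daily_ee_kcal, resting_ee_kcal):
    """PAL = total daily energy expenditure / resting energy expenditure."""
    if resting_ee_kcal <= 0:
        raise ValueError("resting energy expenditure must be positive")
    return total_daily_ee_kcal / resting_ee_kcal


def meets_pal_target(pal, target=1.7):
    """True if PAL is at or above the target (boundary is an assumption)."""
    return pal >= target


# Hypothetical patient: 2400 kcal/day total, 1500 kcal/day resting.
pal = physical_activity_level(2400, 1500)
print(round(pal, 2), meets_pal_target(pal))
```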
Six studies derived the mean activity intensity by dividing total counts by wear time, providing an estimate of counts per hour. Three studies also considered the average movement intensity during walking, measured in g, and one in average metres per second. Two studies reported peak performance, defined as the maximum number of steps achieved in a single 1 minute epoch [53,68]. One of these studies also derived a similar summary statistic, "mean max 30", defined as the mean number of steps achieved across the 30 1-minute epochs recording the highest step counts [53]. This has parallels with the concept of peak stepping cadence developed by Tudor-Locke and Rowe [95].
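The intensity summaries in this paragraph can be sketched directly from per-minute data. The example below is illustrative only, assuming a list of steps per one-minute epoch and a known wear time; the function names are hypothetical and are not taken from the cited studies.

```python
def counts_per_hour(total_counts, wear_minutes):
    """Mean activity intensity: total counts divided by wear time in hours."""
    return total_counts / (wear_minutes / 60)


def peak_one_minute(steps_per_minute):
    """Peak performance: the highest step count in any one-minute epoch."""
    return max(steps_per_minute)


def mean_max_n(steps_per_minute, n=30):
    """Mean step count across the n one-minute epochs with the most steps."""
    top = sorted(steps_per_minute, reverse=True)[:n]
    return sum(top) / len(top)


# Hypothetical hour of data with 1 to 60 steps in successive minutes.
steps = list(range(1, 61))
print(peak_one_minute(steps))
print(mean_max_n(steps))
print(counts_per_hour(12000, 720))
```

Because `mean_max_n` sorts all epochs, the 30 highest minutes need not be consecutive, which matches the definition given above.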
Single studies also summarised activity intensity by considering the intensity during bouts of moderate and vigorous physical activity (MVPA). In one study [51], rather than reporting the total time in MVPA, outcome measures included the number of MVPA bouts exceeding 10 min in duration, the mean bout duration and the mean intensity of bouts (counts per minute). This study also examined the proportion of patients achieving the physical activity recommendations of the Centers for Disease Control and Prevention, the American College of Sports Medicine [96], and the British Association of Sport and Exercise Sciences [97] that existed at the time. These required at least 150 min per week of MVPA (counted in bouts exceeding 10 min). A second study examined the number, duration and intensity of activity bouts exceeding 10 min above thresholds of 1.5, 2.6 and 3.4 METs [58].
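Bout-based MVPA outcomes of this kind can be derived from per-minute counts as sketched below. This is a minimal illustration with several assumptions: the default threshold of 1952 counts per minute is assumed as the point immediately above the "lifestyle" range quoted earlier, a bout is any uninterrupted run of at least 10 min at or above the threshold, and the function names are hypothetical.

```python
def mvpa_bouts(counts_per_minute, threshold=1952, min_bout_minutes=10):
    """Return (start, end) index pairs (end exclusive) of qualifying bouts.

    threshold and min_bout_minutes are illustrative assumptions.
    """
    bouts, start = [], None
    # A sentinel below any threshold closes a bout that runs to the end.
    for i, c in enumerate(list(counts_per_minute) + [float("-inf")]):
        if c >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_bout_minutes:
                bouts.append((start, i))
            start = None
    return bouts


def summarise_bouts(counts_per_minute, bouts):
    """Number of bouts, total and mean duration, and mean intensity."""
    total = sum(end - start for start, end in bouts)
    counts_in_bouts = sum(sum(counts_per_minute[s:e]) for s, e in bouts)
    return {
        "n_bouts": len(bouts),
        "total_minutes": total,
        "mean_bout_minutes": total / len(bouts) if bouts else 0.0,
        "mean_intensity_cpm": counts_in_bouts / total if total else 0.0,
    }


counts = [0] * 5 + [2000] * 12 + [0] * 3 + [2500] * 9 + [0] * 2 + [3000] * 10
bouts = mvpa_bouts(counts)
print(bouts)
print(summarise_bouts(counts, bouts))
```

Summing `total_minutes` over the valid days of a week and comparing the result with 150 min would give the guideline check described above.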

Daily activity profile
Two studies explored the profile of activity across the day. One study [47] examined the average daily activity profile curves of COPD subjects compared to controls, and also analysed the activity recorded in three segments of the day: morning (8:00 h to 13:00 h), afternoon (13:00 h to 17:00 h), and evening (17:00 h to 20:00 h). A second study [71] evaluated individual activity-time profiles by deriving average 24 hour time-profiles for each subject. These were calculated by taking the activity observed within each hour of the day and averaging this across all of the days observed. Perhaps fundamental in deriving these average profiles is an assumption that certain activities, and periods of inactivity, regularly occur at similar times of day.
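The averaging step used to build a 24 hour profile can be sketched as follows. This is an illustration under assumptions rather than the method of the cited study: each day is assumed to be a list of 24 hourly activity totals, hours without valid data are marked `None` and left out of that hour's average, and the function name is hypothetical.

```python
def average_hourly_profile(days):
    """Average activity per clock hour (0 to 23) across observed days.

    `days` is a list of 24-element sequences; None marks an hour with
    no valid data, which is excluded from that hour's average.
    """
    profile = []
    for hour in range(24):
        observed = [day[hour] for day in days if day[hour] is not None]
        profile.append(sum(observed) / len(observed) if observed else None)
    return profile


# Two hypothetical days; hour 5 of the second day has no valid data.
day1 = [float(h) for h in range(24)]
day2 = [3.0 * h for h in range(24)]
day2[5] = None
profile = average_hourly_profile([day1, day2])
print(profile[10], profile[5])
```

Summing a slice of the profile, for example `profile[8:13]`, would give a segment total comparable to the morning, afternoon and evening segments described above, provided none of the hours in the slice is `None`.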

Discussion and recommendations
Large scale implementation of activity monitors in regulatory clinical trials requires accepted and consistent methodology for how these wearables are used and how the data they generate is interpreted. Usage standards and methods are likely to vary according to the patient population, disease indication and perceived treatment benefits. This review of activity monitor use with COPD patients has shown little consensus and much variability in many of the methodology areas we have identified as important in clinical trial implementation. While consensus can be derived in some areas of methodology, we weigh this against a more detailed consideration of the limitations imposed by the disease along with other key works in activity measurement. We propose standards and the rationale behind our recommendations, and include with each a measure of confidence in our recommendations based on the strength of associated scientific evidence found in our review or elsewhere in the scientific literature (Table 3). This level of confidence is summarised using a red, amber or green categorisation. A green classification represents high confidence in the recommendations based on the high consensus in our review and/or strong scientific evidence elsewhere. Amber classification represents, in our opinion, a reasoned approach where a number of different alternatives have been reported. Where we have classified an item as red, we propose that the recommendation has merit but that more research and evaluation is needed to provide a standard for future research.
Tudor-Locke et al. in their evaluation of activity in special populations [98] found activity levels amongst COPD patients to be one of the lowest amongst the groups they studied. Median steps per day were 2237 steps/day in COPD studies compared to 8008 steps/day amongst subjects with type 1 diabetes. Similarly, Walz et al. [77] demonstrated that few patients in each category of COPD severity (GOLD I-IV) recorded a PAL exceeding 1.7, implying sedentary activity levels across all levels of disease severity. Treatment-related improvements in lung function may directly relate to increased willingness and motivation to participate in recommended exercise regimens, and in particular reduce the quantity of sedentary behaviour and its impact. Aerobic exercise (riding an exercise bicycle or walking) and resistance exercise (lifting light weights with the arms and legs) can help restore and maintain functional independence in COPD. The American College of Sports Medicine (ACSM) supports the viewpoint that light to moderate physical activity (30 min a day, on most, if not all days of the week) is beneficial for improving the quality of life in persons with COPD [99].
More recently, it has been shown that sedentary behaviour and moderate and vigorous physical activity are independently related to functional fitness (see for example [10,[100][101][102]). In essence, periods of sedentary behaviour have a negative impact on functional fitness even when periods of activity and exercise are undertaken. This has associated impact with chronic disease progression and risk factors. It would seem appropriate that characterisation of sedentary behaviour in addition to physical activity is important in most cases, and perhaps particularly so in indications such as COPD where barriers to exercise exist.
The above suggests that in COPD, measuring changes in sedentary behaviour, and the daily time spent in light or moderate activity may be important elements to consider when assessing treatment related effects and associated improvements in quality of life. It is assumed that subjects providing the minimum number of valid days provide sufficient data to provide reliable estimates of daily activity. Less than this amount of data does not provide sufficient information for robust estimation of overall activity. We do not recommend modelling or other imputation methods for subjects with an insufficient number of valid days as this approach has not been validated in this population. Without contextual information explaining the reason for missing values the best unbiased approach to missing data estimation cannot be determined.

Sedentary behaviour
Due to the level of activity expected in the patient population, treatment related effects on overall activity are most likely to be seen first in changes in sedentary behaviour and light/moderate activity.

Red
Light/moderate activity
Our proposed measures of time in physical activity based on cadence of 60 steps/min or greater for minimum intervals of 2 min and 5 min are intended to measure periods of time in rhythmic, continuous ambulatory activity as opposed to sporadic pottering activity. This threshold is an achievable cadence for COPD patients even at more advanced stages of the disease, based on 6MWT data. We identify two bout intervals as the duration of 5 min may not be achievable by all patients, this longer bout measuring greater endurance in patients with higher fitness levels. The associated metabolic stimulus of this form of exercise is different to pottering behaviour and important for cardiovascular health as opposed to functional activity. Total steps and cadence (steps/min), unlike counts, are common across instruments, interpretable and easily related to programme goals.
• Number of walking episodes of at least 1 minute duration per day: Amber
• Mean walking episode length: Amber
• Maximum walking bout length per day: Red
• Mean walking cadence (steps/min): Red
• Time in physical activity (cadence ≥60 steps/min for a minimum interval of 2 min and a minimum interval of 5 min): Red

Overall activity
• Total steps per day: Green

a Green: high confidence in the recommendations based on the high consensus in our review and/or elsewhere; amber: a reasoned approach where a number of different alternatives have been reported; red: recommendation has merit but more research and evaluation is needed to provide a standard for future research.

Measuring changes in sedentary behaviour
There are a number of considerations regarding the measurement and characterisation of sedentary behaviour. Epoch length, for example, has been shown to affect the estimation of total sedentary time. Atkin et al. [103] reported that studies exploring epoch length for measurement of sedentary time are inconsistent in their conclusions, but in general a shorter epoch length is recommended. With accelerometers that provide raw data, the data are summarised into epochs by the device software, which enables different epoch lengths to be applied post hoc.
A key component of measuring sedentary time is understanding posture. Some authors conclude that devices measuring only activity and not providing reliable posture information may be less accurate in measuring sedentary behaviour than devices able to measure both components [104]. For example, time spent standing still is not sedentary, and is valuable to health in breaking up sedentary behaviour, whereas time spent sitting or lying is sedentary. In addition, sleeping is not generally detrimental to health and therefore should not be classed as sedentary behaviour. It is therefore important to be able to distinguish time spent sleeping from wakeful lying down, although doing so reliably may be out of scope of current activity monitor algorithms. Where measurement of sedentary behaviour and treatment related changes is important, a monitor positioned on the thigh is of greatest value as this can most effectively determine sedentary postures from the inclination of the thigh in addition to activity performed. Ongoing algorithm development work for ActivPAL, a thigh-positioned accelerometer, may also help to distinguish lying from sitting, and to some extent wakeful lying from lying asleep, due to other thigh movements characteristic of each posture, such as thigh rolling when lying (Douglas Maxwell, ActivPAL, personal communication).
Few devices support convenient thigh placement, but the ActivPAL has been developed specifically for this, with the objective of measuring sedentary behaviour and postural changes. This device, about the size and shape of an SD card, is attached to the thigh of the patient within a nitrile sleeve and beneath a waterproof dressing (e.g. Tegaderm™ transparent dressing). This provides permanent and waterproof attachment for 7 days, which is the period recommended in this article. This kind of permanent attachment has the additional advantage that it enables the full 24 hour period to be studied without wear/non-wear compliance considerations. This approach is generally well accepted by patients, with a small number reporting skin irritation associated with the dressing and requiring its replacement to maintain the required recording period.
Determining clinically relevant measures of sedentary behaviour from the rich stream of accelerometer data recorded is an area of current research. Ideally, one or more derived measures should assess both the total amount of sedentary time and the length of bouts or the degree to which sedentary periods are broken up by changes in posture or periods of activity. Additionally, sleep time should be determined in order to differentiate this from wakeful lying.
There remains uncertainty about how to characterise clinically relevant periods of sedentary behaviour. For example: a) How long does sitting have to continue in order to constitute a clinically relevant bout of sitting? b) How long does standing/walking have to continue to constitute a clinically relevant "break in sitting"? c) Does the answer to question b) depend on the answer to question a), and vice versa? For example, if a subject sits for 20 min but breaks this by standing for 5 s, does this constitute two 10-minute bouts of sitting, with one meaningful break, or one 20-minute bout of sitting, with an inconsequential standing event in the middle?
Answering these questions is the focus of active research. Some authors have proposed summary measures of sedentary behaviour. Tieges et al. [105], for example, recommend three summary statistics to measure sedentary behaviour from activity monitor data: total sedentary time, the weighted median sedentary bout length and the Fragmentation Index. Tieges et al. describe the Fragmentation Index as a single summary measure describing the pattern of accumulation of sedentary time, where a higher Fragmentation Index indicates that total sedentary time is accumulated from a higher number of shorter bouts rather than a few prolonged periods. This index is calculated in the same way as the Sleep Fragmentation Index that is used to assess sleep arousal in sleep studies using the number of arousals and total sleep time.
The latter two measures proposed by Tieges et al. show potential promise in that they attempt to summarise both the total sedentary time and the number of bouts or bout length into single indices. However, more needs to be understood about the properties of these measures. For example, when calculating the weighted median, how much data (how many bouts) is needed to generate reliable estimates for each subject? In addition, summary statistics need to be related to health outcomes and be sufficiently sensitive to detect changes associated with improvement or worsening of those outcomes. The proposed Fragmentation Index, for example, would return the same value of 1.0 for a subject with a single one-hour bout of sedentary behaviour and a second subject with 10 bouts of 1 h. It would seem likely that these scenarios would be associated with different health outcomes. In addition, as the Fragmentation Index is the inverse of the mean bout length, it is likely to be well correlated with the median bout length and total sedentary time measures proposed and so may offer little additional value. More research will be required to determine the most clinically relevant sedentary measures.
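The limitation described above can be made concrete with a minimal sketch. It assumes the Fragmentation Index is the number of sedentary bouts divided by total sedentary time in hours, which is inferred from the statement that the index is the inverse of the mean bout length; the function name and units are our own choices for illustration.

```python
def fragmentation_index(bout_hours):
    """Number of sedentary bouts divided by total sedentary time (in hours).

    bout_hours: duration of each sedentary bout in hours (assumed unit).
    Equivalent to the inverse of the mean bout length.
    """
    total = sum(bout_hours)
    if total <= 0:
        raise ValueError("total sedentary time must be positive")
    return len(bout_hours) / total


subject_a = [1.0]        # a single one-hour sedentary bout
subject_b = [1.0] * 10   # ten one-hour sedentary bouts

for label, bouts in (("A", subject_a), ("B", subject_b)):
    print(label, "total sedentary h:", sum(bouts),
          "fragmentation index:", fragmentation_index(bouts))
```

Both subjects return the same index despite a tenfold difference in total sedentary time, which is why the index would need to be interpreted alongside total sedentary time rather than alone.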
Based on what is currently known about summary measures for sedentary behaviour, we suggest that the total sedentary time, the maximum sedentary bout, and the number of postural transitions from sitting/lying to standing/walking are of value to characterise changes in sedentary activity and capture changes in discretionary activity. It is acknowledged, however, that more research is needed to better understand the specific properties of sedentary behaviour that relate to diminished health outcomes (such as the time of day of sedentary periods, the length of sedentary periods, the number of breaks in sedentary behaviour and the minimum beneficial break length). This will lead to a greater ability to identify the most clinically relevant measures to describe sedentary behaviour and its impact on health.
Sleep quality is affected in COPD, which may lead to daytime sleepiness, daytime napping and extended periods of wakeful lying. As described above, napping does not have the same detrimental effects as being seated or lying while awake and so it may become increasingly important to be able to measure sleeping episodes, both nocturnal and during the day, alongside activity and posture information to accurately characterise sedentary behaviour. Because daytime napping occurs in COPD, standard time algorithms, or daily logs to determine bedtime and morning wake-time, are likely to be insufficient in this population.

Measuring light/moderate and overall activity
In a potentially very sedentary population, measures such as overall activity expressed as daily counts, energy expended, or time spent being active may be less sensitive to detection of small but important changes in mobility. Quality of life improvements may be achieved from modest short discretionary activities that a subject elects to perform in addition to required functional activities performed out of necessity, such as getting up to walk to the bathroom. People with short strides or a shuffling gait, seen in some COPD subjects, have less vertical displacement of their centre of gravity while walking, making it more likely that pedometers or waist-mounted accelerometers may underestimate steps or activity [106]. However, the volume of research studies examined did not identify this as a limitation of using accelerometry in this patient population.
As described previously, COPD patients are recommended to follow an exercise schedule that may consist of aerobic exercise and resistance activity. We recommend that in addition to the overall number of counts per day, researchers investigating activity changes in COPD clinical trials also explore the number and duration of walking episodes. Thought should be given to the ability to detect the difference between cycling and stepping if some patients elect to use an exercise bicycle to follow their aerobic recommendations. Early validation work to distinguish between cycling and stepping has been performed, for example using the ActivPAL [107], and further algorithm development work is anticipated to enable greater precision in determining different forms of exercise. We propose that calculation of the number of walking episodes per day, the mean and maximum walking bout length, and mean walking cadence are sensible measures of light and moderate activity in these patients.
In line with the American College of Sports Medicine recommendations of 30 min of light to moderate activity per day for COPD patients, we additionally propose that the length of time per day in bouts of at least light physical activity (LPA) is important. Our review found a large variation in the definition of cut-points for different intensity categories and their interpretation. Count thresholds are device-specific and researchers need to refer to validation studies on specific devices to determine the appropriate count cut-points to define LPA. For Actigraph (Pensacola, FL) devices, 100 cpm appears a common threshold to define the transition point from sedentary behaviour to light activity in our review. A recent study has proposed 150 cpm as the least biased estimate of the sedentary/light activity threshold, although this is based on a small sample of subjects [108]. While acceleration measures, such as counts, provide sophisticated measures of both the presence and magnitude of activity, counts have less direct interpretability. While not reported in the studies we reviewed, we propose intensity thresholds based on cadence to provide understandable and interpretable measures that can be easily adapted into guidance and targets, and are well correlated with other intensity threshold measures [95]. To distinguish between sporadic pottering activity and a period of sustained rhythmic, continuous, purposeful ambulatory movement we propose two cadence bout thresholds of 60 steps/min for at least 2 min and 60 steps/min for at least 5 min. The cadence threshold of 60 steps/min is reasonable for patients with COPD, based on published evidence in controlled walk tests. For example, Annegarn et al. [109] reported cadences in excess of 100 steps/min during a 6MWT for patients at all stages of COPD.
Similarly, in three other studies [110][111][112], participants with COPD completed the 6MWT at speeds typically corresponding to cadences over 100 steps/min (based on prior evidence of the speed/cadence relationship [91]). We identify two bout intervals as the duration of 5 min may not be achievable by all patients, this longer bout measuring greater endurance in patients with higher fitness levels. In the study reported by Annegarn et al. [109], for example, 15% of patients were unable to complete the 6-minute test without resting. A further study [113] similarly showed that 4/7 patients rested during a 6MWT, some for more than 1 min. We recommend additional research to determine whether one or both bout duration measures are most useful in this population. These measures represent a sustained period of stepping as opposed to a simple step accumulation target as differentiated by Stansfield et al. [114]. The associated metabolic stimulus of this form of sustained exercise is different to pottering and short breaks in sedentary behaviour, and is important for cardiovascular health as opposed to functional activity. This period of sustained stepping is also likely to indicate an excursion out of the house, which is associated with quality of life components. Accelerometers to measure steps and cadence must provide accurate step counting and time-stamped stepping data such as steps per minute or time spent above specified stepping rates. Tudor-Locke and Rowe [95] summarise the accuracy and ability of a number of commercial devices to do this, and many common commercial accelerometers provide suitable sensitivity including the ActivPAL and Actigraph devices.
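The proposed cadence-based measures can be derived from minute-level stepping data along the following lines. This is a sketch under stated assumptions: steps are available per 1-minute epoch, a bout is an unbroken run of epochs at or above the cadence threshold, and no interruption is tolerated within a bout (the rule for brief pauses would need to be defined in a trial protocol). The function name is hypothetical.

```python
def minutes_in_cadence_bouts(steps_per_minute, cadence_threshold=60, min_bout_minutes=2):
    """Total minutes spent in bouts of sustained stepping.

    A bout is an unbroken run of 1-minute epochs with at least cadence_threshold
    steps, lasting at least min_bout_minutes. Shorter runs are treated as
    sporadic activity and do not contribute.
    """
    total = 0
    run = 0
    for steps in steps_per_minute:
        if steps >= cadence_threshold:
            run += 1
        else:
            if run >= min_bout_minutes:
                total += run
            run = 0
    if run >= min_bout_minutes:  # account for a bout that reaches the end of the record
        total += run
    return total


# Hypothetical record containing runs of 2, 1 and 5 minutes above threshold.
record = [0, 70, 65, 0, 90, 0, 62, 75, 80, 66, 61, 0]
time_2min = minutes_in_cadence_bouts(record, min_bout_minutes=2)
time_5min = minutes_in_cadence_bouts(record, min_bout_minutes=5)
print(time_2min, time_5min)
```

By construction the 5 min measure can never exceed the 2 min measure, so the difference between them indicates how much sustained stepping occurs only in shorter bouts.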

Period of wear, valid day and number of valid days needed
The most common period of wear reported in the articles within this review was 7 days. This is an intuitively logical choice as it captures weekdays and weekend days. Hart et al. [115] estimated that 5 days of measurement is sufficient to enable representative measurement of physical activity and sedentary behaviour in older adults. As described above, COPD patients are amongst the least active of many special populations, and it would seem reasonable that the recommendations of Hart and colleagues would be relevant to apply here. While activity on weekends, or non-working days, can be quite different to that of working days in some populations (see [9] for example), in more compromised patient populations there may be less ability to work and the difference between weekdays and weekend days may not be relevant. We conclude that a measurement interval providing at least five days of evaluable data, with no requirement to measure a specific number of weekend or weekday days, would be acceptable in COPD subjects.
It is desirable to use a device that is not removed by the patient, and thus records the full 24 hour interval. Where this is not possible, we feel the consensus of a minimum of 10 hour wear time per day may not be sufficient to accurately characterise activity and mobility. Herrmann et al. [116] used datasets from the National Health and Nutrition Examination Survey (NHANES, [117]) to explore the impact of wear time on activity estimates. Their examination of data from around 4000 individuals concluded that using 12 h or less of wear data significantly reduced estimates of time spent in activity and sedentary behaviour and may potentially affect estimates of individuals meeting activity recommendations. They concluded that using more than 12 h of activity data per day ensures accurate estimates of daily physical activity. For certain devices this may represent a compliance concern and an impractical approach that would result in insufficient evaluable data. For example, Tucker [118] reported wear compliance of a waist-worn accelerometer using NHANES data as 18% for 7 or more days with at least 12 h per day, versus 33% for at least 10 h a day. However, if sedentary behaviour is measured, as we recommend for this population, a thigh-based monitor is optimal and these can be conveniently affixed to the body for the full 24 hour period under a discreet waterproof dressing as described for the ActivPAL device above.
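The interplay between the valid-day wear threshold and the number of evaluable days can be sketched as follows. The defaults (10 h per day, five valid days) reflect the review consensus and the conclusion above; the 12 h alternative is shown only for comparison, and the function name and the weekly data are hypothetical.

```python
def has_sufficient_data(daily_wear_hours, min_wear_hours=10.0, min_valid_days=5):
    """Return (number of valid days, whether the subject is evaluable).

    A day is treated as valid when wear time is at least min_wear_hours; the
    subject is evaluable when at least min_valid_days days are valid, with no
    requirement for specific weekdays or weekend days.
    """
    valid_days = sum(1 for hours in daily_wear_hours if hours >= min_wear_hours)
    return valid_days, valid_days >= min_valid_days


# Hypothetical wear times (hours) over a 7-day monitoring period.
week = [13.5, 11.0, 12.5, 9.0, 14.0, 10.5, 12.0]
print(has_sufficient_data(week))                       # 10 h threshold
print(has_sufficient_data(week, min_wear_hours=12.0))  # stricter 12 h threshold
```

In this example the same subject is evaluable under a 10 h rule but not under a 12 h rule, which mirrors the compliance trade-off reported by Tucker [118].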
Kang and Rowe [119] estimated the time spent in different physical behaviours based on aggregated data from a number of studies in adults, using data from NHANES. They identified that on average 0.4-3.0% of time over a 24-hour period is spent in MVPA, 27-41% is spent in LPA, 31-39% is spent in sedentary behaviour and 28-31% in sleep. As LPA and sedentary time account for so much of the overall 24 hour period, obtaining accurate estimates of the daily time spent in these states may require a lengthier sampling interval. Obtaining a reliable estimate of daily sedentary and LPA times, however, may be achievable using a shorter sampling interval and adjusting for the period of wear, for example by estimating the proportion of wear time accounted for by sedentary behaviour, or standardising the sedentary time measured to a standard waking day of 16 h.
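The two wear-time adjustments mentioned above amount to simple arithmetic, sketched here with hypothetical values. The function name is our own, and the sketch assumes that behaviour during unobserved waking time resembles that during observed wear time.

```python
def standardised_sedentary_hours(sedentary_hours, wear_hours, waking_day_hours=16.0):
    """Return (proportion of wear time sedentary, sedentary hours per standard waking day).

    Assumes behaviour during unobserved waking time resembles observed wear time.
    """
    if wear_hours <= 0:
        raise ValueError("wear time must be positive")
    if not 0 <= sedentary_hours <= wear_hours:
        raise ValueError("sedentary time must lie between 0 and wear time")
    proportion = sedentary_hours / wear_hours
    return proportion, proportion * waking_day_hours


# Hypothetical day: 7.5 h sedentary observed during 12 h of wear.
proportion, standardised = standardised_sedentary_hours(7.5, 12.0)
print(proportion, standardised)
```

Here 7.5 h of sedentary time in 12 h of wear corresponds to 62.5% of wear time, or 10 h when scaled to a 16 h waking day.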

Definition of wear and non-wear
Accurate identification of non-wear episodes is an important consideration in accurate measurement of sedentary time. When a period of zero counts is assumed to represent non-wear, a period of sedentary time may be wrongly classified as a non-wear episode. Ideally, devices with wear sensors or devices that are adhered to the body (e.g. ActivPAL) may eliminate this potential challenge.
Most non-wrist-worn devices do not contain skin sensors to determine accurately when devices were worn and removed. For this reason, assumptions regarding the length of time that a device recording no activity should constitute a period of non-wear need to be made. Where reported, 60 min of zero counts was the most common, with the allowance of two one-minute epochs with non-zero counts not exceeding a certain device-specific threshold (e.g. 100 cpm for the ActiGraph device) appropriate to allow for the device being moved whilst not worn or for random electrical disturbance. Studies in children consider shorter intervals as evidence of non-wear due to the level of movement expected in younger individuals.
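One possible reading of the most common non-wear rule is sketched below. The interpretation is an assumption on our part: a minute is flagged as non-wear if it lies within any run of at least 60 min that starts and ends on zero counts and contains no more than two 1-minute epochs with non-zero counts at or below the ceiling (100 cpm here). Published algorithms differ in their details, and the function name is hypothetical.

```python
def nonwear_flags(counts, min_length=60, max_interruptions=2, interruption_ceiling=100):
    """Return one boolean per 1-minute epoch: True where the epoch is classed as non-wear."""
    n = len(counts)
    flags = [False] * n
    i = 0
    while i < n:
        if counts[i] != 0:
            i += 1
            continue
        # Grow a candidate window starting on this zero-count minute.
        j = i
        last_zero = i
        first_interruption = None
        interruptions = 0
        while j < n:
            c = counts[j]
            if c == 0:
                last_zero = j
            elif c <= interruption_ceiling and interruptions < max_interruptions:
                interruptions += 1
                if first_interruption is None:
                    first_interruption = j
            else:
                break  # too much movement, or the allowance is used up
            j += 1
        end = last_zero + 1  # trim trailing interruptions: window ends on a zero
        if end - i >= min_length:
            for k in range(i, end):
                flags[k] = True
        # Restart after the first interruption so later candidate windows are not missed.
        i = end if first_interruption is None else first_interruption + 1
    return flags


# Hypothetical records: 66 quiet minutes containing one small blip, then one large blip.
tolerated = [500] * 10 + [0] * 30 + [40] + [0] * 35 + [500] * 10
not_tolerated = [500] * 10 + [0] * 30 + [300] + [0] * 35 + [500] * 10
print(sum(nonwear_flags(tolerated)), sum(nonwear_flags(not_tolerated)))
```

In the first record the 40 cpm blip is tolerated and all 66 minutes are classed as non-wear; in the second the 300 cpm epoch splits the quiet period into runs of 30 and 35 min, neither of which qualifies. The second case also illustrates the risk noted above, since the same rule applied to a genuinely motionless patient would misclassify sedentary time as non-wear.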

Validation of endpoints for clinical drug licencing submissions
Activity data measured using an activity monitor falls into the Performance Outcomes (PerfO) category of Clinical Outcome Assessments (COAs). The Pharmaceutical Company (Sponsor) will be required to provide an evidence dossier to support the use of the activity monitor in the patient population studied. This will include supporting evidence of the device's measurement properties to show that the device is able to accurately measure the construct of interest (for example activity, posture etc.). Generally this will be obtained from the literature, where some devices have been subject to published robust validation studies in a variety of patient populations to demonstrate the validity, accuracy and precision of the device measurement and firmware algorithms. Sponsors should select a device with sufficient validation evidence to satisfy this requirement.
It will also be required of the sponsor that derived endpoints summarising the activity data are clinically meaningful in the target population. It is less likely that this information can be obtained directly from a manufacturer in all cases, but over time the use of certain summary endpoints in published studies may provide a starting point. Most likely, data using activity monitors during phase II of the drug development process will be useful in demonstrating that the summary endpoints derived are clinically meaningful, for example that they correspond to other treatment related improvements and quality of life changes.

Conclusions
In this article we suggest that the widespread use of activity monitors in pharmaceutical clinical trials may be limited by uncertainties over regulatory requirements concerning the validity of devices and the data management assumptions made (such as definitions of valid days), the clinical relevance and validity of derived summary outcome measures, and a lack of agreed standards for implementing activity data collection in clinical trial protocols.
While not the focus of this article, we recommend that devices selected for use in clinical trials should be confined to those for which there is a body of validation evidence published by researchers and/or device manufacturers.
The research literature contains many examples of the use of activity monitors in COPD studies. However, despite the value of these devices in measuring important health outcomes for pharmaceutical clinical trials, the research community has yet to converge on a set of standard methodologies for data management assumptions, the development of meaningful derived summary outcome measures, and implementation. This article serves to summarise current practice and provide recommendations based on the research literature for a set of implementation and data management standards, and a set of pertinent derived health outcome measures considered important in describing changes in activity in this patient group. Throughout the recommended standards, areas where additional research is needed to provide more definitive conclusions are identified.
These proposed standards may help to drive the use of activity monitors in measurement of free-living activity in COPD clinical drug development programmes, and may have application beyond COPD and apply equally to other disease indications characterised by low intensity of physical activity and high degrees of sedentary behaviour.

Appendix A. Supplementary data
The full metadata description of all studies included in this article can be found online at http://dx.doi.org/10.1016/j.cct.2016.01.006.