Identifying adults’ valid waking wear time by automated estimation in activPAL data collected with a 24 h wear protocol

The activPAL monitor, often worn 24 h d−1, provides accurate classification of sitting/reclining posture. Without validated automated methods, diaries, which are burdensome to participants and researchers, are commonly used to ensure measures of sedentary behaviour exclude sleep and monitor non-wear. We developed, for use with 24 h wear protocols in adults, an automated approach to classify activity bouts recorded in activPAL ‘Events’ files as ‘sleep’/non-wear (or not) and on a valid day (or not). The approach excludes long periods without posture change/movement, adjacent low-active periods, and days with minimal movement and wear based on a simple algorithm. The algorithm was developed in one population (STAND study; overweight/obese adults 18–40 years) then evaluated in AusDiab 2011/12 participants (n = 741, 44% men, aged >35 years, mean ± SD 58.5 ± 10.4 years) who wore the activPAL3™ (7 d, 24 h d−1 protocol). Algorithm agreement with a monitor-corrected diary method (usual practice) was tested in terms of the classification of each second as waking wear (Kappa; κ) and the average daily waking wear time, on valid days. The algorithm showed ‘almost perfect’ agreement (κ > 0.8) for 88% of participants, with a median kappa of 0.94. Agreement varied significantly (p < 0.05, two-tailed) by age (worsens with age) but not by gender. On average, estimated wear time was approximately 0.5 h d−1 higher than by the diary method, with 95% limits of agreement of approximately this amount ±2 h d−1. In free-living data from Australian adults, a simple algorithm developed in a different population showed ‘almost perfect’ agreement with the diary method for most individuals (88%). For several purposes (e.g. with wear standardisation), adopting a low burden, automated approach would be expected to have little impact on data quality. The accuracy for total waking wear time was less and algorithm thresholds may require adjustments for older populations.


Introduction
Excessive time spent in sedentary behaviours, that is, sitting or reclining while awake with low energy expenditure (⩽1.5 metabolic equivalents) (Sedentary Behaviour Research Network 2012), has been associated with several chronic diseases and premature mortality (Thorp et al 2011, Wilmot et al 2012, Cong et al 2014, Shen et al 2014, Biswas et al 2015). Evidence regarding the health consequences of sedentary behaviour and intervention effectiveness can be improved with the use of monitors that can assess time spent in sedentary behaviour objectively and accurately during free-living conditions. The activPAL is a small, unobtrusive, thigh-worn monitor that can meet such a need, by accurately measuring periods spent in sitting/lying posture (Lyden et al 2012). However, the device output includes periods of sitting/lying that do not constitute sedentary behaviour, such as sleep and non-wear.
The methods researchers have applied for sleep and non-wear removal as identified in a recent review (Edwardson et al 2016) are varied and mostly high burden, limiting accuracy and the feasibility of collecting sedentary behaviour measures. For continuous (24 h) wear protocols, usual practice has involved excluding diary-reported sleeping periods (Ryan et al 2011, Alkhajah et al 2012, Craft et al 2012, Gorman et al 2013, Reid et al 2013, Berendsen et al 2014, Aguilar-Farias et al 2015). The low-burden alternatives have no published validity and have key limitations (Edwardson et al 2016). One study (Godfrey et al 2014) excluded very long sitting/lying bouts (>8 h) from their sitting estimates, claiming they were likely sleep. However, sleep can be <8 h and interspersed with movement. Activity has been examined during assumed waking periods (Godfrey et al 2014, Smith et al 2014, Barreira et al 2015b), such as 08:00-20:00, that are unlikely to apply to every individual every day. Recently, Chastin and colleagues (Chastin et al 2014) estimated each individual's waking day as beginning with the first standing bout after ⩾2 h of continuous sitting/lying during the hours 00:00-09:00, and ending with the first bout of standing before >3 h of sitting/lying after 22:30. However, sleep can be non-nocturnal, as can be the case for older adults with polycyclic sleeping patterns, or for shift workers. In overnight-removal protocols, researchers only need to identify non-wear. Acceleration data have been used to this end (Harrington et al 2011, Barreira et al 2015a) with an unknown degree of validity.
To address the need for validated, low-burden automated methods, we created a simple automated algorithm to isolate adults' valid waking wear periods within activPAL data collected with a continuous wear protocol. It was developed and refined using data from a study of UK overweight/obese young adults aged 18-40 years, then tested in a large, population-based study of Australian adults aged 35-89 years. In the absence of a feasible gold-standard for free-living data, we compared the algorithm with usual practice (a diary-based method). We also considered our validity findings in light of an automated method (van der Berg et al 2016) that emerged after our review (Edwardson et al 2016).

Methods
Development and validation studies conformed to the Declaration of Helsinki. All participants signed written informed consent. Ethics was approved by the Nottingham National Health Service Research Ethics Committee (Sedentary Time and Diabetes, STAND) and the Alfred Health Human Ethics Committee (Australian Diabetes Obesity and Lifestyle Study, AusDiab).

Algorithm development study
The STAND study (Wilmot et al 2011) included 187 overweight and obese adults aged 18-40 years (n = 125 with relevant data). Participants wore the activPAL3™ monitor continuously, 24 h d−1, for 10 d. Monitors were waterproofed, then attached to the mid-line anterior aspect of the right thigh by an adhesive medical dressing. Detailed wear and re-attachment instructions were provided to participants along with a paper diary to record the times they went to bed, went to sleep, woke up, arose from bed, and any times they removed their monitor.

Automated algorithm development
The algorithm development process was iterative and collaborative, with a strong element of trial and error. Firstly, the investigators evaluated the current practices in the literature and the procedures and experiences in removing sleep and invalid data that have been used in our studies to date (published and unpublished) employing the activPAL monitor (Edwardson et al 2016). The relevant underpinning principles and the general algorithm rules were determined, considering current practices as well as salient observations about sleep and monitor performance. These are outlined in figure 1. Key decisions, based on the current state of the field, were that the approach should be simple, focus not on 'when' sleep may occur but on 'what' sleep and removals are (for activPAL data), and fulfil immediate, addressable needs. Accordingly, we decided to develop a simple algorithm based on knowledge of the behaviours (sleep, activity, monitor wear), that could be tested using available data. It removes non-wear periods, non-wear days, and what we have termed 'sleep' from the valid waking wear data. 'Sleep' is the broader period the person spends in bed, from 'into-bed' or 'lights out' time to finally awakening or arising from bed, including brief periods out of bed such as to visit the bathroom. We did not aim to provide sub-classifications of the excluded data, such as sleep versus non-wear, or time asleep by biological definitions versus other time in bed.
Next, specific rules for the algorithm were discussed, trialled and decided upon based on performance in the STAND study data. Early attempts at algorithms implementing specific rules were trialled and reported at conferences. Coding issues were rectified and ultimately a single set of specific rules was chosen (figure 2). The thresholds for the rules (figure 2) can be adjusted for different populations; those we used are reported here. We implemented and report two versions, described for convenience as versions A and B of the same algorithm. These use the same rules, with minor variations that arose because they were implemented in different software by different coders. Assessing two versions evaluates the robustness of the algorithm to minor differences in how the rules may be applied by a different coder and in different software packages. It also provides two sets of freely available source code for use or checking for version A (DB's STATA code; supplementary materials 1 (stacks.iop.org/PM/37/1653/mmedia)) and version B (EW's SAS code; supplementary materials 2). Figure 2 summarises the algorithm's general and specific rules, and displays the minor differences between the two versions. A glossary (supplementary material 3) contains further information about the key terms and definitions of both versions of the algorithm. The algorithm requires only data that are routinely available in the proprietary activPAL Events files. Events files have a separate row for each continuous period of sitting/lying and standing, and each individual step/stride. The algorithm attributes the entirety of each bout to the day on which the bout begins (see supplementary material 3).

The automated algorithm
The algorithm rules are summarized briefly here. The algorithm's first step finds long periods that are most likely to be sleep or non-wear. Sleep/non-wear bouts were identified as (1) the longest bout per 24 h period (from noon-to-noon each day) that lasts ⩾2 h, or (2) any very long bouts lasting ⩾5 h. This allows sleep/non-wear to occur at any time, any number of times (including never) within a 24 h window. Because sleep can register as multiple periods of sitting/lying interspersed with real or erroneously detected posture changes and stepping, the next step iteratively examines surrounding bouts and determines whether they are more likely additional sleep/non-wear (limited movement) or waking wear (more movement). Bouts were 'surrounding' if any portion was within a 15 min window before or after a sleep/non-wear bout. All bouts in the sleep window were classed as sleep/non-wear when the window contains any of these: a sitting/lying or standing bout that is long (⩾2 h), or moderately long (⩾30 min) with very few (⩽20) steps in between; a sleeping/non-wear bout; or, posture changes without intervening steps. This step repeats until no more sleep/non-wear is found. The third step identifies invalid days from limited wear and movement, using wear criteria typical of the literature and movement criteria loosely based on prior approaches (Mutrie et al 2012). Specifically, days were classed as non-wear if they met any of these criteria: limited variation in activities (⩾95% of waking wear in any one activity); limited stepping (<500 steps); or, limited waking wear time (<10 h). The final step is quality control. We validated our algorithm against a diary-based method. Other possibilities for quality control (not performed here) are shown in supplementary material 4.
[Figure 1: underpinning principles and general algorithm rules]
Sleep
(b) Sleep usually registers as sitting/lying but sometimes as standing (another stationary activity that the activPAL delineates from sitting/lying by estimated leg angle only) and not always as a single bout. It can be interspersed with posture changes and steps (erroneously detected or briefly out of bed).
(c) Waking wear has more frequent posture changes and more steps than sleep.
(i) Do not assume sleep occurs at a particular time of day or even that it does occur.
(ii) Do not assume sleep is sitting/lying.
(iii) Search for sleep as very long periods spent stationary (sitting/lying and possibly also standing) and the immediately adjacent activities (which may be more broken-up sleep).
Nonwear
(a) 24-h wear protocol with adhesive attachment means participants typically remove the monitor briefly to change the dressing, or for very long periods when it falls off or when the skin becomes irritated.
(b) The unworn monitor is usually laid flat (registering sitting/lying), or can be propped upright (registering standing) or is sometimes carried in a pocket/handbag (recording various activities including steps).
(c) Non-wear typically has fewer posture changes and steps than wear. Some wear time registers on non-wear days, for which a minimum wear requirement can be used or a minimum step requirement (activPAL).
(i) Target only long removals.
(ii) The methods to find 'sleep' will also find most long removals.
(iii) With imperfect wear/non-wear assignation, additional rules for non-wear days are likely required in addition to finding non-wear periods.
(iv) Non-wear days criteria should consider postural variations, steps and amount of wear time (with a 10-hour rule as it is common in the monitoring field).
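As a rough illustration of the first step described above (finding candidate 'sleep'/non-wear bouts), the following Python sketch flags any bout of ⩾5 h, plus the longest bout in each noon-to-noon window if it lasts ⩾2 h. It is not the authors' STATA or SAS code: the `Bout` structure, the activity labels and the function names are hypothetical, and it assumes bouts of any activity are eligible and that ties for the longest bout go to the earliest one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Bout:
    start: datetime    # time the bout begins
    duration_h: float  # bout length in hours
    activity: str      # hypothetical labels: "sit", "stand", "step"


def noon_window(start: datetime):
    """Label the noon-to-noon 24 h window in which a bout begins."""
    return (start - timedelta(hours=12)).date()


def flag_long_bouts(bouts, longest_min_h=2.0, very_long_h=5.0):
    """Return one bool per bout: True for a candidate 'sleep'/non-wear bout."""
    # Index of the longest bout within each noon-to-noon window.
    longest = {}
    for i, bout in enumerate(bouts):
        key = noon_window(bout.start)
        if key not in longest or bout.duration_h > bouts[longest[key]].duration_h:
            longest[key] = i

    flags = []
    for i, bout in enumerate(bouts):
        is_longest = longest[noon_window(bout.start)] == i
        flags.append(
            bout.duration_h >= very_long_h
            or (is_longest and bout.duration_h >= longest_min_h)
        )
    return flags


# Illustrative (made-up) bouts spanning two noon-to-noon windows.
bouts = [
    Bout(datetime(2024, 1, 1, 14, 0), 1.5, "sit"),
    Bout(datetime(2024, 1, 1, 23, 0), 6.0, "sit"),
    Bout(datetime(2024, 1, 2, 5, 10), 2.5, "sit"),
    Bout(datetime(2024, 1, 2, 13, 0), 1.0, "sit"),
    Bout(datetime(2024, 1, 2, 14, 30), 0.5, "stand"),
]
print(flag_long_bouts(bouts))  # [False, True, False, False, False]
```

In this example the 2.5 h bout that follows the 6 h bout is not flagged at this stage; under the rules described above it would be a candidate for reclassification in the second step, which examines bouts surrounding an already flagged bout.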

Validation study
The AusDiab study was initiated in 1999/2000 as a general, population-based sample of community-dwelling adults aged ⩾25 years (n = 11247) sampled probabilistically from non-rural areas across Australia by a multistage process (Dunstan et al 2002). In the third follow-up (2011/12), 4614 adults aged ⩾35 years attended the onsite assessment at 46 sites across Australia (Tanamas et al 2013). A subsample of 782 participants were fitted at the onsite assessment with the activPAL3™ monitor (77% of the 1014 invited to participate) (Healy et al 2015) and valid data (i.e. at least one valid day of wear by the diary-based method) were obtained from 741 (95% of those provided with a monitor). Waterproofed monitors were affixed on the midline, one third of the way down the thigh with a breathable hypoallergenic dressing. Trained staff usually affixed the monitors but simply checked the monitor placement was correct for any participant who preferred to self-attach the monitor privately. Participants were asked to wear the monitor at all times over a seven-day period beginning the day after the onsite assessment and to not remove the monitor, even during showering, bathing or swimming, or for sleep unless it was likely to be lost or damaged (e.g. swimming in the ocean). Dressings and swabs to reattach the monitor were provided along with a diary covering sleep (i.e. 'lights out') and wake times, and monitor removals (if any). The monitors were initialised and downloaded using the activPAL software 6.4.1 (PAL Technologies Ltd, Glasgow, UK). Monitors were either initialised to record immediately or in advance, from midnight of the first intended wear day.
Figure 2. An automated approach to estimating valid waking wear periods from activPAL events data collected in adults using a continuous wear protocol (a).
STEP 1: Identify bouts (b) unlikely to be part of waking wear: long periods without posture change/movement
Definitions: Bouts are continuous periods of any one activity. Version A treats consecutive steps/strides as a single activity; Version B treats each individual stride/step, as per the activPAL events, as a separate activity. The entire bout and its full duration are attributed to the day on which the bout begins.
Specific rules: Bouts are classed as 'sleep'/non-wear (SLNW) if they are either of the following: ⩾5 h duration, or ⩾2 h duration and the longest bout found per 24-h period from noon-to-noon the next day. The longest per 24-h period is evaluated based on the 24-h period in which the bout began.
STEP 2: Iteratively examine surrounding bouts to determine whether these are more likely part of a continuous sleep/non-wear period (inactive) or not (active)
Definitions: Surrounding bouts are all bouts that begin within a sleep window of 15 minutes after a SLNW bout finishes or that finish within a sleep window 15 minutes before a SLNW bout starts.
Specific rules: All surrounding bouts are classified as SLNW if the sleep window contains: a sitting/lying or standing bout of ⩾2 h (Version A and B), or a SLNW bout (Version B), or a sitting/lying or standing bout of ⩾30 min and ⩽20 steps, or only posture changes without intervening steps. Version A: reclassify as SLNW any standing bout that is found in between a sitting/lying SLNW bout and another sitting/lying bout with no steps in between. Version B: all bouts in the sleep window are classed as SLNW if no steps (only posture changes) are within the window.
STEP 3: Identify other invalid data: days of limited movement and wear
Specific rules: All bouts on a day are marked as invalid if the day contains: any one activity that accounts for ⩾95% of waking wear time, or <500 steps, or <10 hours of waking wear.
STEP 4: Suggested quality controls: checking, error correction
Checking example: Visualise the activity during the time periods included and excluded as valid waking wear data (e.g. as heatmaps), comparing against an external data source (e.g. diary) if collected or otherwise examining for plausibility of the classifications.
Error correction example: Adapt thresholds, overwrite specific instances of misclassification.
a Underlined indicates the general rules. Thresholds intended to be modifiable are in bold.
b 'Sleep' refers to the broad period from in bed or 'lights out' to wake up or out of bed in the morning, including time in bed not asleep and brief periods out of bed. It is not limited to time biologically asleep.
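The day-level validity criteria of the algorithm's third step (any one activity accounting for ⩾95% of waking wear, <500 steps, or <10 h of waking wear) lend themselves to a compact check. The sketch below is illustrative only: the function name, the per-day dictionary of waking-wear seconds and the activity labels are assumptions, and the three thresholds are exposed as parameters because they are intended to be adjustable.

```python
def day_is_invalid(wear_seconds_by_activity, steps,
                   max_share=0.95, min_steps=500, min_wear_h=10.0):
    """Return True if a day fails any of the three validity criteria.

    wear_seconds_by_activity: waking-wear seconds per activity for one day,
    e.g. {"sit": ..., "stand": ..., "step": ...} (labels are hypothetical).
    """
    wear_s = sum(wear_seconds_by_activity.values())
    if wear_s < min_wear_h * 3600:   # limited waking wear time
        return True
    if steps < min_steps:            # limited stepping
        return True
    # Limited variation: one activity dominates waking wear.
    return max(wear_seconds_by_activity.values()) >= max_share * wear_s


H = 3600  # seconds per hour
# Made-up days for illustration.
print(day_is_invalid({"sit": 9 * H, "stand": 4 * H, "step": 1.5 * H}, steps=8000))     # False
print(day_is_invalid({"sit": 13.5 * H, "stand": 0.3 * H, "step": 0.1 * H}, steps=600))  # True
print(day_is_invalid({"sit": 6 * H, "stand": 2 * H, "step": 1 * H}, steps=5000))        # True
```

The second day fails on limited variation (sitting is about 97% of waking wear) and the third on wear time (9 h).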
Diary data were entered into an MS Access database (n = 776, n = 5387 d), checked for missing times, errors in reported dates (non-consecutive) and times (e.g. am/pm), and converted to the same time-zone as the monitor data. Staff estimated missing sleep/wake times from the monitor (n = 157 participants, n = 299 d), looking for a single or multiple long periods of sitting/lying between days. Checks occurred also after processing, using graphs (heatmaps) of activity classifications over time for every participant, with staff re-checking the diary upon encountering suspicious data (e.g. very long periods in the valid data that look like non-wear). These were classed as wear if the participant had indicated they did not remove the monitor, otherwise non-wear.

Data processing
Data were processed using STATA v14.0 (StataCorp Texas, USA) for version A, and SAS 9.4 (SAS Institute Inc., Cary, USA) for version B. The comparison method was not a gold standard. It was a diary-based method, consistent with usual practice (Edwardson et al 2016), with monitor corrections, as reported previously (Healy et al 2015). Monitor corrections based on surrounding movement were used as diary reporting is often imprecise (e.g. wake at 6 am, which unlikely occurred at precisely 06:00). Events were initially identified as awake and as non-wear if they mostly (i.e. ⩾50%) occurred during these diary-reported periods (e.g. wake to sleep). For example, with a diary-reported waking period of 6 am-10 pm on a particular day, an event that began at 5:58 am and finished at 6:10 am that day would be classed as awake, initially, while an event that began at 5:50 am and ended at 06:02 am that day would not. Non-wear included removals, all time before wake on the first day, and all time after sleep on the last day. Then, the beginnings and ends of sleep periods initially identified were adjusted to not begin/end until the first/last event lasting at least 20 min. The diary criteria reported previously (Healy et al 2015) contained elements that are not appropriate for comparing diary and automated procedures. For better comparability, the diary days were classified as invalid if they had <10 h of waking wear, using the definitions of 'days' as per each algorithm version (see Glossary). To align the diary with version A, the entire bout was treated as 'sleep'/non-wear if any event in the bout was 'sleep'/nonwear according to the diary method.
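The initial diary classification, in which an event is assigned to a diary-reported period if at least half of it falls within that period, can be illustrated with the worked example given above. This is a minimal sketch rather than the processing code used in the study; the function names are hypothetical and the date is arbitrary.

```python
from datetime import datetime


def overlap_fraction(ev_start, ev_end, per_start, per_end):
    """Fraction of an event's duration that lies inside a diary period."""
    length = (ev_end - ev_start).total_seconds()
    if length <= 0:
        return 0.0
    overlap = (min(ev_end, per_end) - max(ev_start, per_start)).total_seconds()
    return max(overlap, 0.0) / length


def mostly_within(ev_start, ev_end, per_start, per_end, threshold=0.5):
    """True if the event mostly (at least `threshold`) occurs in the period."""
    return overlap_fraction(ev_start, ev_end, per_start, per_end) >= threshold


# Diary-reported waking period of 6 am-10 pm on an arbitrary date.
wake = datetime(2012, 3, 5, 6, 0)
sleep = datetime(2012, 3, 5, 22, 0)

# Event 5:58-6:10 am: 10 of its 12 minutes are after waking.
print(mostly_within(datetime(2012, 3, 5, 5, 58), datetime(2012, 3, 5, 6, 10), wake, sleep))  # True
# Event 5:50-6:02 am: only 2 of its 12 minutes are after waking.
print(mostly_within(datetime(2012, 3, 5, 5, 50), datetime(2012, 3, 5, 6, 2), wake, sleep))   # False
```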

Statistical analysis
Analyses were performed in SAS 9.4 and STATA 14.0. Significance was set at p < 0.05, two-tailed. For each individual, we examined the agreement of the bout classifications as valid waking wear (yes/no) as kappa, frequency-weighted by bout duration (rounded to the nearest second or rounded up to one second) to indicate agreement on an approximately second-by-second basis. Differences in agreement by age and gender were examined using a non-parametric test of medians. Agreement in average daily waking wear time was assessed using the Bland-Altman approach, with variation in mean differences and error across average values tested using regression models (Brown and Richmond 2005). To indicate the impact of choosing one data reduction method over another, we estimated sitting, standing and stepping time using each algorithm and the diary-based method independently. We examined means and standard deviations of time spent in the various activities, with and without correction for waking wear time and correlations of these algorithm measures with the diary-based measures.
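To make the duration-weighted agreement statistics concrete, the sketch below computes Cohen's kappa, sensitivity and specificity from segments weighted by their duration in whole seconds, treating the diary method as the referent. It is an illustration, not the study's SAS/STATA analysis: the function name and the input format (one tuple per segment on which both classifications are constant) are assumptions, and the example data are made up.

```python
def weighted_agreement(segments):
    """Duration-weighted kappa, sensitivity and specificity.

    segments: iterable of (duration_s, algorithm_wear, diary_wear), where the
    two booleans say whether each method classed the segment as waking wear.
    """
    a = b = c = d = 0  # weighted 2x2 table: a = both yes, d = both no
    for duration_s, algo, diary in segments:
        w = max(1, round(duration_s))  # nearest second, but at least 1 s
        if algo and diary:
            a += w
        elif algo:
            b += w
        elif diary:
            c += w
        else:
            d += w
    n = a + b + c + d
    if n == 0:
        raise ValueError("no segments supplied")
    p_obs = (a + d) / n
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    nan = float("nan")
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp < 1 else nan
    sensitivity = a / (a + c) if a + c else nan
    specificity = d / (b + d) if b + d else nan
    return kappa, sensitivity, specificity


# Made-up day: 14 h agreed wear, 8 h agreed 'sleep'/non-wear, 45 min disputed.
segments = [
    (14 * 3600, True, True),
    (30 * 60, True, False),
    (15 * 60, False, True),
    (8 * 3600, False, False),
]
kappa, sens, spec = weighted_agreement(segments)
print(round(kappa, 2), round(sens, 2), round(spec, 2))  # 0.93 0.98 0.94
```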

Results
The validation sample participants (table 1; n = 741) covered men (44%) and women of a wide range of ages (36-89 years, median = 57 years), and socioeconomic backgrounds, with 37% working full time and 30% retired. Most were born in Australia or New Zealand (81.6%) and very few reported currently smoking (7%). Many were categorised as overweight (43%) or obese (25%). The average BMI (mean ± SD) was 27.6 ± 5.1 kg m−2. There were some small selection biases in age, socioeconomic position and waist circumference. The algorithm development study participants (68% female) were younger (33.8 ± 5.6 years) and heavier (BMI 34.6 ± 8.9 kg m−2) than the validation sample.
Both versions of the algorithm achieved near identical results for agreement with the diary-based method in the waking wear (yes/no) classifications of each second (table 2). The algorithm achieved a high median sensitivity (0.95), specificity (0.99) and chance-corrected agreement as indicated by kappa (0.94). Agreement was substantial or better (κ > 0.6) for almost all participants (>97%) and was 'almost perfect' (Landis and Koch 1977) for most participants (88%). Agreement with the diary did not vary significantly by gender, but varied significantly by age (p < 0.001), with less (but still good) agreement (median κ > 0.9), seen in those aged ⩾65 years than their younger counterparts.
The same status as to valid/invalid day was assigned to >98% of days that occurred from the diary period onwards (table 3). The pre-diary period was not counted as monitors were unlikely to have been worn for ⩾10 waking hours at this time, since monitors were fitted on the day prior to the first diary day and were sometimes recording at that time. The algorithm excluded <1% of diary-classified valid days as invalid and included only 3-4% of diary-invalid days as valid data. The discrepant classifications were seldom clear algorithm errors (n = 8 d) or diary errors (n = 5 d). Most occurred after the diary period and could reflect algorithm errors, or participants wearing the monitor after they ceased filling in their seven-day diary.
The mean difference and the random error increased significantly with the average of both measures (all p < 0.001) (figure 3). On average, the algorithm (version A and B, respectively) significantly overestimated waking wear time relative to the diary by 31 and 32 min d−1 (i.e. 3% of a 16 h waking day), with 95% limits of agreement of −86 to +149 min d−1 and −87 to +150 min d−1 (i.e. −9% to +16% of a 16 h waking day). Limiting to the days valid by both methods, the correlation in average daily waking wear time with the diary-based method was 0.67 (95% CI: 0.62, 0.72) for version A and 0.67 (0.61, 0.72) for version B.
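For readers unfamiliar with the Bland-Altman quantities reported here, the sketch below shows the usual calculation of the mean difference and 95% limits of agreement (mean difference ± 1.96 SD of the differences), and re-expresses the reported values as a percentage of a 16 h (960 min) waking day. The function names and the participant values are hypothetical, differences are taken as algorithm minus diary, and the regression of differences on averages used in the study is not reproduced.

```python
from statistics import mean, stdev


def bland_altman(algorithm, diary, z=1.96):
    """Mean difference (algorithm minus diary) and 95% limits of agreement."""
    diffs = [a - d for a, d in zip(algorithm, diary)]
    bias = mean(diffs)
    half_width = z * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def percent_of_waking_day(minutes, waking_day_h=16):
    """Express a number of minutes as a percentage of the waking day."""
    return 100.0 * minutes / (waking_day_h * 60)


# Made-up average daily waking wear times (min/d) for four participants.
algo_min = [930, 965, 900, 1010]
diary_min = [900, 940, 905, 960]
bias, lower, upper = bland_altman(algo_min, diary_min)
print(f"bias {bias:.1f}, limits {lower:.1f} to {upper:.1f} min/d")

# Reported version A values as a share of a 16 h waking day.
for minutes in (31, -86, 149):
    print(minutes, round(percent_of_waking_day(minutes), 1))  # 3.2, -9.0, 15.5
```

The last loop reproduces the percentages quoted in the text to within rounding (3%, −9% and +16%).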
The mean amounts of waking wear time, sitting, standing and stepping varied by only a small degree (±0-3%) from those obtained by usual practice (table 4). The algorithm gave slightly higher estimates of mean sitting and lower estimates of mean standing and stepping than the diary-based method, with or without standardising the data for waking wear time. The standard deviations for waking wear time by the algorithm were larger by approximately 20% than by the diary method, while those for sitting, standing, and stepping varied only by ±5% or less. Correlations with the diary-based estimates (table 5) were close to 1 for sitting, standing and stepping when standardising for waking wear time and for unstandardized standing and stepping (⩾0.97), were strong for unstandardized sitting time (r = 0.88) and were lowest at r = 0.63 for waking wear time.

Discussion
This study, along with a recent publication (van der Berg et al 2016), presents the first attempts at developing and validating automated estimation methods for isolating waking wear time in activPAL data collected via continuous (24 h) wear protocols. Across a very broad range of adult participants, for most individuals, the algorithm agreed acceptably with a referent method that was entirely independent of the algorithm. Notably, this finding was observed in a study population (AusDiab, Australian adults ⩾35 years) that was independent and different from that used for the algorithm development (STAND study, UK overweight/obese adults aged 18-40 years), indicating good generalisability. Collectively, these populations covered most adult ages. The two versions in different software, with different coders and slightly different definitions performed near identically, indicating the algorithm is fairly robust to minor variations in how it may be implemented.
Without a gold standard, both methods can contribute to disagreement. Nonetheless, the algorithm and diary-based method showed a high degree of agreement in many respects. For the valid days of data that would typically be used to examine physical activity and sedentary behaviour, each second was classed similarly as part of waking wear or not by both methods, with median sensitivity/specificity of 95%/>99% and chance-corrected agreement of κ = 0.94. The agreement was not constant across all levels of wear time; however, with a population similar to AusDiab in terms of waking hours and compliance, we would expect agreement in average daily waking wear time to be within a few hours for 95% of individuals (e.g. −86 to 150 min d−1). This was more disagreement than van der Berg and colleagues saw between their algorithm and self-report waking hours (−1.1, 1.2 h d−1) (van der Berg et al 2016), reflecting either a lower level of accuracy, key differences in the validation process and populations, or both.
Our accuracy was comparable with that achieved with other monitors for detecting 'bedrest', for example, a sensitivity/specificity of 97%/97% (waist-worn ActiGraph accelerometer) and 98%/97% (wrist-worn ActiGraph accelerometer) obtained in a small validation sample of youth in a laboratory setting relative to a whole-room calorimeter (Tracy et al 2014). Our agreement was less than has been reported for non-wear algorithms relative to their referent criteria, such as −134, 143 min d−1 in free-living participants against a diary method and −52, 132 min d−1 against observation in a laboratory setting (Choi et al 2012). This might be expected with a free-living assessment against an imperfect referent, and the additional difficulties in identifying sleep, with a greater degree of movement difference for non-wear versus wear than sleep versus wake.
Older adults move less than younger persons during their waking day (Matthews et al 2008) and are particularly prone to sleep problems such as insomnia (Sivertsen et al 2009). A less pronounced movement difference between sleep and wake may have reduced algorithm accuracy. Tailoring of algorithm rules and/or thresholds to the population's movement patterns may improve accuracy. However, the algorithm relies heavily on assuming that very long periods spent in a single posture predominantly occur during sleep or non-wear. Our algorithm and any using similar general rules may have limited accuracy in populations prone to extremely prolonged sitting/lying during their waking hours, who step very little, or who have interrupted sleep patterns.
The algorithm-derived activity measures correlated highly with those derived from the diary method, especially when standardising for waking wear time (⩾0.97). For many purposes, the practical impact of method choice is likely minimal. With correlations for waking wear time of 0.6-0.7 the impact may be more substantial for methods such as compositional analysis (Chastin et al 2015) that rely on estimates of each waking and sleeping activity. Quality controls may be used to increase accuracy. However, even with quality controls, the automated method likely entails less researcher burden than existing practice, avoiding data entry (when collecting paper-based diaries), cleaning and the need to estimate any unreported sleep or wake times.
Table 1 (partial). Row label not recovered: 34.6 (4.9), 33.8; 34.3 (4.7), 33.5; p = 0.230. Waist circumference, cm: 103.3 (13.9), 101.0; 102.5 (13.0), 101.0; p = 0.497.
Table 1 notes. a p for difference between included participants (n = 741 in AusDiab; n = 125 in STAND) and those excluded (not selected or did not provide data, n ≈ 3873 in AusDiab; not providing data, n = 62 in the STAND study). b For AusDiab: Current = any amount now and ⩾100 cigarettes in lifetime, Ex = none now but ⩾100 cigarettes in lifetime, and never = smoked <100 in lifetime; n = 4507 attendees and 736 participants. Table presents mean (standard deviation; SD), median or sample n (%). For AusDiab data, the mean, SD and % are corrected for the complex survey design using survey commands, linearized variance.
Table 3 notes. a Total n days varies depending on whether days were classified based on the day a bout or activPAL event began. b Days valid by algorithm but invalid by diary were for the following (mutually exclusive) reasons: algorithm errors (n = 8 d); diary errors (n = 5 d); and unclear which estimation is correct (remaining n = 54 d by version A and n = 57 d by version B). Wear status was unclear for days after the diary period (n = 48 and n = 50 d), for days close to the 10 h threshold for waking wear (n = 13 d), and when the difference arose from incompatible definitions (n = 3 d). c Diary-valid days were rejected by the algorithms for these reasons (more than one applied at a time): wear time was close to the 10 h threshold; the step count threshold for a valid day was not met by an apparently inactive participant; and, long periods during which participants did not report a removal were identified as sleep/non-wear.
Our algorithm excluded all but 48 of 2411 (or 50 of 2527) days that occurred after the diary ended as being non-wear without also excessively removing days during the diary period as invalid. This lends some support to the view that a simple minimum wear rule with minimum movement criteria can screen out unwanted data, though thresholds other than 10 h, 95% and 500 steps should be tested and optimised. The 10 h wear rule is based more in common practice than evidence that it is optimal for removing unwanted data or providing unbiased coverage of a day. Other criteria could potentially lead to less bias and/or more valid days (with better reliability).
Study strengths included the large, diverse, population-based sample and the assessment of performance in free-living conditions, in a sample not used for algorithm development, over a seven-day continuous wear protocol reflective of usual practice for this monitor (Edwardson et al 2016). Though not population representative and with some biases in the subsampling and participation, generalisability is likely to be better than typical small-scale validity studies. Relatedly, this entailed an unavoidable weakness in the referent method, as direct observation was not a feasible option, and usual practice was used rather than a gold-standard. For the diary method, errors can include data entry and participants reporting times incorrectly (e.g. imprecise reporting, am/pm errors, not mentioning a removal, or mentioning a removal occurred for the other monitor they wore concurrently). Also, some disagreement would be expected as the algorithm excludes long periods with limited movement while the diary method excludes all reported monitor removals (regardless of duration or degree of movement) and days outside of the main wear period (regardless of whether the participant wore the monitor or not). Importantly, correlated errors, which overstate agreement (Rennie and Wareham 1998), are unlikely with such different sources of error for each method.
The findings may not generalise to populations not tested (children, adolescents, extremely elderly participants) or with limited inclusion in our study (shift workers, mobility impaired). Without rules to identify brief removals, the algorithm would not be recommended for studies using easy-removal attachment methods (e.g. pouches or PAL stickies). Additional rules for short removals would be needed. Further improvements may be obtained by more complex algorithms, such as by incorporating acceleration (for short removals and to separate sleep from non-wear), or raw data approaches, which may yield hitherto unavailable behavioural classifications (lying down, actual sleep) in the activPAL data. Preliminary work in these areas shows promise.

Conclusion
A simple algorithm isolating valid waking wear time within activPAL events data generated similar (not identical) classifications to usual practice, with a much lower burden. It was robust to some variation in implementing the rules. Using the algorithm in a moderately large epidemiological dataset (n ≈ 700) suggested that for many purposes, adopting the low-burden algorithm is not likely to worsen data quality substantially relative to usual practice (a diary-based method), though the accuracy of either of these methods relative to true wake/sleep and wear/non-wear status remains to be seen.