Energy expenditure estimation during activities of daily living in middle-aged and older adults using an accelerometer integrated into a hearing aid

Background Accelerometers were traditionally worn on the hip to estimate energy expenditure (EE) during physical activity but are increasingly replaced by products worn on the wrist to enhance wear compliance, despite potential compromises in EE estimation accuracy. In the older population, where the prevalence of hearing loss is higher, a new, integrated option may arise. Thus, this study aimed to investigate the accuracy and precision of EE estimates using an accelerometer integrated into a hearing aid and compare its performance with sensors simultaneously worn on the wrist and hip. Methods Sixty middle-aged to older adults (average age 64.0 ± 8.0 years, 48% female) participated. They performed a 20-min resting energy expenditure measurement (after an overnight fast) followed by a standardized breakfast and 13 different activities of daily living, 12 of which were individually selected from a set of 35 activities, ranging from sedentary and low intensity to more dynamic and physically demanding activities. Using indirect calorimetry as a reference for the metabolic equivalent of task (MET), we compared the EE estimations made using a hearing aid integrated device (Audéo) against those of a research device worn on the hip (ZurichMove) and consumer devices positioned on the wrist (Garmin and Fitbit). Class-estimated and class-known models were used to evaluate the accuracy and precision of EE estimates via Bland-Altman analyses. Results The findings reveal a mean bias and 95% limits of agreement for Audéo (class-estimated model) of −0.23 ± 3.33 METs, indicating a slight advantage over wrist-worn consumer devices (Garmin: −0.64 ± 3.53 METs and Fitbit: −0.67 ± 3.40 METs). Class-known models reveal a comparable performance between Audéo (−0.21 ± 2.51 METs) and ZurichMove (−0.13 ± 2.49 METs). Sub-analyses show substantial variability in accuracy for different activities and good accuracy when activities are averaged over a typical day's usage of 10 h (+61 ± 302 kcal).
Discussion This study shows the potential of hearing aid-integrated accelerometers in accurately estimating EE across a wide range of activities in the target demographic, while also highlighting the necessity for ongoing optimization efforts considering precision limitations observed across both consumer and research devices.


Introduction
Engaging in regular physical activity (PA), defined as any bodily movement produced by skeletal muscles that results in energy expenditure (EE) (1), is associated with a lower risk for numerous chronic diseases and premature death (2). Meeting PA guidelines is sufficient to elicit health benefits, in particular in previously sedentary people. These benefits appear to increase in a dose-dependent manner (2). However, about one-quarter of the Swiss population does not meet the requirements of at least 150 min of moderate-intensity PA or 75 min of high-intensity PA per week. This proportion increases to about one-third in individuals aged 75 years or older (3). In light of strong evidence linking PA to healthy aging (4), the precise monitoring and promotion of PA among middle-aged and older adults emerge as critical strategies for personalized prevention and enhancing public health.
The use of wearable devices, which measure acceleration along one or three axes, represents a promising method for such monitoring. They are easy to use, unobtrusive, and have already been shown to estimate activity and/or EE with reasonably good accuracy in healthy adult populations (5, 6) as well as in patients with chronic disease (7-10). In addition, in a recent systematic review and meta-analysis, activity trackers have been shown to be effective in promoting an increase in PA and reducing sedentary time in older adults (11). While this underscores the potential of these devices to estimate EE and promote PA in the middle-aged and older population, there is still an ongoing debate regarding the most appropriate sensor location.
Originally, accelerometers were worn on the hip, but they are now increasingly worn on the wrist. For example, the National Health and Nutrition Examination Survey initially relied on sensors worn on the hip to capture PA and sedentary behavior (12) and later switched to wrist-worn accelerometers (13). The main arguments typically raised in favor of wrist placement are continuous wearability, enabling sleep monitoring, and improved wear compliance (14). This was demonstrated by Huberty et al. (15), who reported that 24 h monitoring over seven consecutive days in middle-aged women is significantly more effective with wrist-worn sensors, achieving seven valid days of data for 95% of participants, compared to just 62% with hip-worn sensors. However, wrist-worn accelerometers generally provide less accurate EE estimates than hip-worn sensors (16-18). This supports the consensus that sensors positioned closer to the body's center of mass are more precise than those located more distally, such as on the arm or wrist. Although wrist sensors can accurately estimate EE during locomotion activities (19), disparities might emerge during activities with restricted arm movement (e.g., walking with a stroller) or during activities of daily living (ADL) involving a lot of arm movement (e.g., playing cards). Given that time spent in ADL increases with age (20), this disparity may become more pronounced in older populations. Indeed, Guediri et al. (21) found a greater discrepancy in EE estimates between hip and wrist-worn sensors in older compared to younger subjects under free-living conditions. This highlights the need for novel approaches to accurately capture EE in middle-aged and older populations, addressing the limitations of current methods, including the lower accuracy of wrist-worn sensors and the lower compliance associated with hip-worn sensors.
A promising strategy might involve the use of an accelerometer within a device that users already wear for extended periods throughout the day and that is situated sufficiently close to the body's center of mass to reflect the wearer's movements. Hearing aids might present such a viable option. With an average usage of 10 h per day (22), these devices are commonly worn by an increasing portion of the middle-aged and older population, a group particularly vulnerable to hearing loss. For instance, among US adults aged between 45 and 64, about 3% of men and 2% of women use hearing aids. For those aged 65 and older, the rate rises to 14% (23). This proportion would increase even further if more adults affected by hearing loss used hearing aids. In a nationally representative sample of older US adults, 65% of adults aged 71 years and older had at least a mild hearing loss, but only 29% of them used hearing aids (24). In addition, hearing loss is associated with less PA in adults aged 60-69 years (25). Thus, monitoring and promoting PA within this population can yield significant health benefits.
Previous studies evaluating accelerometers worn around the ear (26-28) have shown promise in estimating EE but were limited to younger adults and a narrow range of activities, and none incorporated the accelerometer directly into a hearing aid. Addressing this gap and given the rising trend of wearables with health sensors (29), the current study aimed to explore whether a similar approach can effectively predict EE in a broader spectrum of ADL for middle-aged and older adults, specifically through an accelerometer integrated into a hearing aid. Thus, this research focused on adults aged 45-64 years and those aged 65 years and older, comparing the accuracy of this new sensor placement with a research device worn at the hip and wrist, as well as wrist-worn consumer devices. We followed the features of phase I in the framework of Keadle et al. (30) and also included activities pertinent to an older population. We hypothesized that the EE prediction accuracy arising from a sensor worn at the ear would be comparable to a hip-worn sensor, yet better than wrist-worn devices.

Participants
Sixty middle-aged and older adults (64.0 ± 8.0 years, BMI 24.4 ± 2.8 kg·m⁻², 48% females) participated in this study (Table 1). Participants were recruited through word-of-mouth, advertisements on the university campus, and in local retirement homes. All 60 recruited participants completed the study.
To qualify for inclusion, subjects were required to be 45 years or older, in good health, and not taking any medication (except for some allowable health conditions and medications for participants aged 65 years and above; refer to exclusion criteria). Additional requirements included a BMI greater than 18.5 kg·m⁻², the ability to perform all activities outlined in the study protocol, normal hearing or at most mild hearing loss, and willingness to comply with the study's procedural guidelines (i.e., to refrain from intense exercise 48 h before testing; to abstain from any exercise 24 h before testing; to ensure a minimum of 7 h of sleep on the two nights before testing; to avoid alcohol on the evening before testing, as well as on the day of testing; to not consume any caffeinated food or beverages before testing on the day of the experimental visit; and to arrive fasted after at least 10 h without food intake for visit 2). Subjects were excluded if they had a history of heart, cardiovascular, metabolic, or neurological disease (incl. seizures and cognitive impairment), a biomechanical dysfunction affecting the ability to perform all activities, ear canal pathologies, moderate or severe hearing loss, an existing implanted medical device that may interfere with data collection, skin allergies or sensitivity to materials or devices used in the experiments, and, specifically for the participants aged 65 years and above, medication influencing heart rate or EE and neurological, orthopedic, rheumatologic, or metabolic disorders influencing upper or lower limb function.
We aimed to recruit 30 adults aged between 45 and 64 years and 30 participants aged 65 years or older, with each group containing a BMI distribution encompassing normal weight, overweight, and obese categories and males and females in equal numbers (≥40% for each sex). The study was approved by the local ethics committee of ETH Zurich (EK 2022-N-44). Every participant gave written informed consent in accordance with the Declaration of Helsinki before participating in the experiment.

Overview
The study involved two separate visits to the Exercise Physiology Lab at ETH Zurich. The first visit, lasting approximately 2 h, aimed to verify the inclusion and exclusion criteria, and to familiarize the participants with the measuring devices and study procedures. The second visit, lasting between 6 and 8 h, constituted the experimental phase of the study and occurred no sooner than 48 h after the initial visit. Data collection took place between September 2022 and August 2023.

Visit 1

Informed consent and questionnaires
Upon arrival at the laboratory, study details were explained to the participants, followed by the collection of their informed consent. Demographic information, dietary habits, and lifestyle details were then assessed through three distinct questionnaires: a health screening questionnaire to assess cardiovascular and respiratory system health/risk, the International Physical Activity Questionnaire-Short Form (31), and a Daily Questionnaire to monitor participants' intake of caffeine, food, and medication, as well as their sleeping patterns and sporting activities over the past 2 days. Participants were then introduced to the study's equipment and procedures. This involved fitting them with the sensors and the facemask that is part of the portable ergospirometric system. Finally, the maximum achievable walking speeds (on both a flat surface and with a 10% incline) and running speeds (on a flat surface) on a treadmill were established for each participant.

Visit 2
Figure 1 illustrates an overview of visit 2. All activities, numbered 01-36 (refer to Table 2), were video recorded using a smartphone and subsequently downloaded onto a hard disk. To ensure optimal data quality, participants were instructed to refrain from speaking during all activities.

Daily questionnaire, anthropometrics, and sensor mounting
Upon arrival at the laboratory, the participants' adherence to the study protocol was confirmed through the daily questionnaire. Then, measurements of weight, height, and body composition were taken as detailed in Section 2.3.3. The acceleration sensors were then positioned on the participants as detailed in Figure 2 and Section 2.3.1, ensuring they were securely fastened to avoid any displacement during the activities. Special care was taken to fit the facemask airtight on the face before starting the ergospirometric device. To ensure the participants' comfort, they were asked about it during each break and, if needed, the respective strap was loosened during the break, the face mask (ergospirometric device) was removed, and/or slight adjustments were made to the strapping, making sure not to change the position of any sensor. After completing the preparatory phase, the measurement of resting energy expenditure (REE) was conducted for 20 min (activity 01, see Table 2). During this time, participants were positioned supine on a stretcher. They were instructed to remain still and quiet. After the REE measurement, with participants still lying supine on the stretcher, blood pressure was measured on the left upper arm, as detailed in Section 2.3.4. Immediately after this, pulse wave velocity was assessed, following the procedure described in Section 2.3.5.

Breakfast and resting period
Participants then proceeded to a table where a standardized breakfast was provided. Subjects weighing less than 90 kg received a meal comprising 70 g of bread, 10 g of butter, 28 g of marmalade, 250 ml of orange juice, and water ad libitum, totaling 450 kcal. For those weighing 90 kg or more, the breakfast composition was identical, except for an increased bread portion of 105 g, totaling 550 kcal. All subjects were instructed to consume the entire meal at their preferred pace. After breakfast, participants rested for 30 min to minimize the impact of diet-induced thermogenesis on subsequent activities.

Physical activities
Next, each participant engaged in 13 different activities, 12 of them chosen from a pool of 35 activities (Table 2); the 13th activity was performed by everyone as the final task (activity 36). During all activities, energy expenditure (ergospirometric device) and acceleration signals were continuously recorded (see Section 2.3).
Activities were grouped into six categories: sedentary & lying, low-intensity activities, activities with varying intensity or not involving physical movement, indoor locomotion-related activities, outdoor activities, and activities requiring aids. For data processing and analyses, the activities were classified as detailed in Section 2.4.4. To ensure a balanced distribution, activities were pseudo-randomly allocated to participants. Each activity had a duration of 8 min, with two exceptions: cycling on the ergometer, which was performed for 6 min, and stair climbing and descending, taking approximately 1.5 min, varying with the participant's chosen walking speed. Before each activity, there was a 2 min sitting period on a chair. In between activities, a standardized break of at least 4 min was provided. After the last activity, all devices were removed, and data were downloaded locally on a computer.
Garmin and Fitbit were attached to the wrist of the dominant arm, while ZurichMove sensors were placed on the hip and both wrists. The hip sensor was fixed via an elastic band with a silicone inlay around the body at hip level, placed directly on the skin. This way, the sensor was located at the level of the iliac crest, on its lateral edge.
Successful data transmission during the test was ensured either by visual inspection of the real-time data feed (ergospirometric device, Audéo, Garmin, Fitbit) or by visually checking the status indicator on ZurichMove. Audéo data were transferred and stored in real time to a smartphone app during the activities. Garmin and Fitbit both recorded on the device, and data were transferred to a smartphone app at the end of all experimental recordings of the day. ZurichMove recorded on the sensor itself, and the data were transferred to a laptop via a docking station at the end of all experimental recordings of the day. Data from the ergospirometric device were transferred in real time to a laptop running dedicated software.

Indirect calorimetry
A portable, battery-operated ergospirometric device (Oxycon Mobile, Vyaire Medical, Höchberg, Germany [53 participants] or a Metamax system) was used for indirect calorimetry, transmitting the data telemetrically to a computer. For technical reasons, the Oxycon system needed to be replaced shortly before the end of the 1-year data acquisition period (the percentage of participants using the Metamax system was similar between the calibration and validation groups, see Section 2.4.3). Each system was mounted on a harness worn on the back. Before each test, a calibration was carried out according to the manufacturer's instructions, which consisted of (1) recording of the ambient conditions, (2) calibration of the flow sensor using a 3-liter calibration syringe, and (3) calibration of the O2 and CO2 sensors using a gas cylinder with known gas concentrations (5% CO2, 16% O2).

Body composition measurements
Height and weight measurements were taken using a stadiometer and an Omron BF511 digital scale (Omron, Kyoto, Japan). Segmental fat and lean body mass proportions, relative to the total body mass, were determined using a calibrated Lunar iDXA densitometer (GE Healthcare, Madison, WI, USA). The data were analyzed following the manufacturer's guidelines, with automatic processing conducted by the device's proprietary software.

Blood pressure
Systolic and diastolic blood pressure was measured on the left upper arm using a cuff and an automated blood pressure monitor (Metronik BL-6, Metronik, Aue, Germany). Before the measurement, subjects were lying comfortably for at least 20 min (see REE measurement, Section 2.2.3). At least 3 valid measurements were taken with a 1 min break in between the measurements. The average of 3 valid measurements was taken as the final value.

Pulse wave velocity
Pulse wave velocity was measured using two piezoelectric pressure sensors (placed manually on the carotid and femoral artery via palpation) sampling at 1 kHz and attached to an acquisition unit (Complior, Alam Medical, Saint Quentin Fallavier, France). Before the measurement, subjects were lying comfortably for at least 20 min (see REE measurement, Section 2.2.3). At least 3 valid measurements were taken. The average of 3 valid measurements was taken as the final value.

Data processing and analysis

Data preprocessing
All raw data were preprocessed using MATLAB R2023a (The MathWorks Inc., Natick, Massachusetts, USA). The primary objective of this preprocessing was to systematically reorganize the data for each participant, categorizing it by activities. Prior to each testing session, clocks of all devices were synchronized. Timestamps from the Audéo hearing aid served as the reference for timing of the activities, and the data from all other devices were adjusted to align with this time.

Processing
The reference EE was calculated from V̇O2 and V̇CO2 values according to Weir (32) using the in-built formula of the metabolic device (Oxycon Mobile), applied to both systems, with UN = 15 g·day⁻¹. Outlier removal for ventilation and gas-exchange variables involved a two-step process: conservative removal of non-physiological values and deletion of values that deviated more than two standard deviations from the local 30 s mean.
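For orientation, the widely quoted full Weir equation with a urinary nitrogen term can be sketched as below. This is an illustration under assumptions, not the device's in-built formula, which may differ in detail: the coefficients are the commonly cited ones (kcal per liter of gas and per gram of nitrogen), and the function name and the per-minute input units are hypothetical choices.

```python
MIN_PER_DAY = 1440  # minutes in 24 h


def weir_ee_kcal_per_day(vo2_l_min, vco2_l_min, un_g_day=15.0):
    """Energy expenditure (kcal/day) from gas exchange, commonly cited Weir form.

    vo2_l_min, vco2_l_min: oxygen uptake and CO2 output in L/min (assumed units).
    un_g_day: urinary nitrogen excretion in g/day (15 g/day as stated in the text).
    """
    vo2_l_day = vo2_l_min * MIN_PER_DAY
    vco2_l_day = vco2_l_min * MIN_PER_DAY
    return 3.941 * vo2_l_day + 1.106 * vco2_l_day - 2.17 * un_g_day


# Hypothetical resting values: 0.25 L/min O2 uptake, 0.20 L/min CO2 output
ee_rest = weir_ee_kcal_per_day(0.25, 0.20)
```

With a fixed UN, the nitrogen term only shifts the result by a constant of roughly 33 kcal per day, so it has little influence on differences between activities.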
In order to estimate EE from acceleration data, the calculation of "acceleration counts" (or simply "counts") was required (see Section 2.4.4). These counts were obtained using a simplified version of the method developed by Actigraph (33). In order to capture signal components related to slow and fast movements (e.g., slow walking and running) while removing high-frequency noise (e.g., vibrations), the x, y, and z acceleration data for the entirety of the signal were bandpass filtered [lower cutoff frequency = 1 Hz, upper cutoff frequency = 12.5 Hz, based on (34)], the squared magnitude was computed as x² + y² + z², and 1 min epochs were approximated by applying a first-order low-pass filter with a 1 min time constant.
Ventilation and gas-exchange variables, and accelerometer counts were averaged over the last 4 min for all activities except climbing stairs, descending stairs, and the three stages of cycling on the ergometer, for which the last 20 s were used. This was done to ensure that participants reached a steady-state V̇O2.

Figure: Overview of measurement devices and their location on the body.
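The count computation described above (band-pass filtering, squared magnitude, low-pass smoothing) can be sketched roughly as follows. The text does not state the order or design of the band-pass filter, so this sketch assumes a cascade of simple first-order high-pass and low-pass stages; the function names and the 50 Hz sampling rate in the usage lines are likewise hypothetical.

```python
import math


def lowpass(signal, fs, tau):
    """First-order low-pass filter with time constant tau (s), sampling rate fs (Hz)."""
    alpha = (1.0 / fs) / (tau + 1.0 / fs)
    state = signal[0]  # start at the first sample to avoid a start-up transient
    out = []
    for value in signal:
        state += alpha * (value - state)
        out.append(state)
    return out


def highpass(signal, fs, cutoff_hz):
    """First-order high-pass: the signal minus its low-passed version."""
    tau = 1.0 / (2.0 * math.pi * cutoff_hz)
    low = lowpass(signal, fs, tau)
    return [v - l for v, l in zip(signal, low)]


def bandpass(signal, fs, low_hz=1.0, high_hz=12.5):
    """Assumed band-pass design: high-pass at low_hz followed by low-pass at high_hz."""
    tau_high = 1.0 / (2.0 * math.pi * high_hz)
    return lowpass(highpass(signal, fs, low_hz), fs, tau_high)


def acceleration_counts(ax, ay, az, fs, epoch_tau_s=60.0):
    """Counts: band-pass each axis, take x^2 + y^2 + z^2, smooth with a 1 min time constant."""
    fx, fy, fz = (bandpass(axis, fs) for axis in (ax, ay, az))
    squared = [x * x + y * y + z * z for x, y, z in zip(fx, fy, fz)]
    return lowpass(squared, fs, epoch_tau_s)


# Usage with synthetic data: 2 min at an assumed 50 Hz sampling rate
fs = 50
n = fs * 120
t = [i / fs for i in range(n)]
zeros = [0.0] * n
still = acceleration_counts(zeros, zeros, [1.0] * n, fs)  # gravity only
bounce = [1.0 + 0.5 * math.sin(2.0 * math.pi * 2.0 * ti) for ti in t]  # 2 Hz movement
moving = acceleration_counts(zeros, zeros, bounce, fs)
```

The high-pass stage removes the constant gravity component, so a motionless sensor yields counts of zero, while the 2 Hz movement falls inside the pass band and produces counts that build up over the smoothing time constant.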

Calibration and validation groups
Subjects were quasi-randomly split into a calibration and validation group using an inbuilt randomization function in MATLAB. Conditions were set to allocate 44 subjects to the calibration group (73%; 22 middle-aged and 22 older) and 16 to the validation group (27%; 8 middle-aged and 8 older). The calibration group served to develop the EE estimation models, while the validation group was used to run statistical analyses.
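A split with these group sizes can be sketched as a shuffle within each age group. The study used a MATLAB randomization function whose exact settings are not given, so the Python version below, including the seed, the function name, and the participant identifiers, is only an assumed equivalent.

```python
import random


def split_by_age_group(participants, n_validation_per_group=8, seed=0):
    """Split (participant_id, age_group) pairs into calibration and validation lists.

    Within each age group, ids are shuffled and the first n_validation_per_group
    go to validation; the rest go to calibration.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    groups = {}
    for pid, group in participants:
        groups.setdefault(group, []).append(pid)
    calibration, validation = [], []
    for ids in groups.values():
        shuffled = ids[:]
        rng.shuffle(shuffled)
        validation += shuffled[:n_validation_per_group]
        calibration += shuffled[n_validation_per_group:]
    return calibration, validation


# Hypothetical ids: 0-29 middle-aged, 30-59 older
participants = [(i, "middle-aged") for i in range(30)]
participants += [(i, "older") for i in range(30, 60)]
calibration, validation = split_by_age_group(participants)
```

With 30 participants per age group, this yields 22 + 22 calibration and 8 + 8 validation subjects, matching the allocation stated above.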

EE estimation
To account for real-world constraints (embedded device with limited computational power, energy storage, and memory), a low-complexity approach was used for the model implementation. The EE per activity p was estimated by linearly mapping (36) the acceleration counts to MET as:

$$\mathrm{MET}_{p} = a_k \cdot \mathrm{counts}_{p} + q_k$$

where the slope $a_k$ and intercept $q_k$ per activity class were determined via linear regression from the counts and reference MET (from ergospirometric device) of all participants in the calibration group. The performed activities were mapped to activity classes based on their expected similarities in their accelerometer signals and MET ranges. Activity classes were either estimated based on the accelerometer signal (class-estimated approach) or manually assigned to one of the following classes: LaySit, Sedentary, ADL, Stationary, WalkFlat, WalkUp, WalkDown, Run, or Aid (class-known approach). The class-estimated approach was used to determine the performance of the Audéo EE estimation model and to compare it to the internal EE estimates from Fitbit and Garmin. The class-known approach was used to compare Audéo with ZurichMove sensors. The EE for participant i during activity p was finally obtained as:

$$\mathrm{EE}_{i,p} = \mathrm{MET}_{p} \cdot \mathrm{BMR}_i$$

where $\mathrm{BMR}_i$ is the participant's basal metabolic rate, according to the Müller equation (37), with sex = 0 for females and 1 for males, weight in kg, age in years, and 239 being the conversion factor from MJ to kcal. The selection of the formula was based on its satisfactory performance in estimating BMR within our study sample (mean absolute bias = 13.3%).
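The per-class linear mapping and the scaling by BMR can be illustrated with a short sketch. The calibration data below are invented for illustration, and the function names are hypothetical. The BMR coefficients are those commonly cited for the Müller et al. equation and are an assumption that should be checked against reference (37).

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_class_models(samples):
    """Fit one (a_k, q_k) pair per activity class from (class, counts, reference MET)."""
    grouped = {}
    for activity_class, counts, ref_met in samples:
        xs, ys = grouped.setdefault(activity_class, ([], []))
        xs.append(counts)
        ys.append(ref_met)
    return {cls: fit_line(xs, ys) for cls, (xs, ys) in grouped.items()}


def estimate_met(models, activity_class, counts):
    """MET = a_k * counts + q_k for the given activity class."""
    a_k, q_k = models[activity_class]
    return a_k * counts + q_k


def mueller_bmr_kcal_per_day(weight_kg, age_years, sex):
    """BMR in kcal/day; sex = 0 for females, 1 for males.

    Coefficients as commonly cited for the Mueller equation (assumption).
    """
    bmr_mj = 0.047 * weight_kg + 1.009 * sex - 0.01452 * age_years + 3.21
    return bmr_mj * 239  # MJ -> kcal


# Invented calibration samples: (activity class, counts, reference MET)
calibration_samples = [
    ("WalkFlat", 100.0, 3.0), ("WalkFlat", 200.0, 4.0), ("WalkFlat", 300.0, 5.0),
    ("Run", 800.0, 7.0), ("Run", 1000.0, 9.0),
]
models = fit_class_models(calibration_samples)
met = estimate_met(models, "WalkFlat", 250.0)
bmr = mueller_bmr_kcal_per_day(70.0, 64.0, 1)
ee_kcal_per_day = met * bmr  # EE rate during the activity, expressed per day
```

Keeping one slope and intercept per class keeps the model small enough for an embedded device, at the cost of relying on a correct class assignment.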
In order to contextualize accuracy, rather than presenting it solely in terms of MET errors, an estimation of the EE over a standardized whole day of hearing aid use (10 h) (22) was calculated. First, based on reference METs, activities were classified into the four intensity levels "very light", "light", "moderate", and "vigorous" according to the intensity criteria described by Garber et al. (38) for older adults aged ≥65 years. Note that the levels "vigorous" and "near maximal to maximal" were combined in order to be compatible with the estimates used for the proportion of time spent in different activity classes. Daily kcal counts for individual participants were then calculated for, and summed over, the four different intensity classes according to the formula:

$$\mathrm{EE}_i = \sum_{k=1}^{4} \mathrm{MET}_{i,k} \cdot \mathrm{BMR}_i \cdot 0.416 \cdot P_k$$

where $\mathrm{EE}_i$ represents the estimated total kcal participant i burned over 10 h, $\mathrm{MET}_{i,k}$ the median estimated MET of all activities within intensity class k for participant i, $\mathrm{BMR}_i$ the estimated BMR of participant i, 0.416 the proportion of 10 h relative to 24 h, and $P_k$ the proportion of hearing aid use time that older adults spend in intensity level k. These proportions were approximated as 73% very light, 17% light, 9% moderate, and 1% vigorous intensity based on data of 18,000 Sonova hearing aid users. These approximations appear broadly consistent with other literature (25).
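As a worked example of this weighted sum, the sketch below uses the time proportions given in the text together with hypothetical median MET values and a hypothetical BMR; the function and variable names are illustrative only.

```python
# Proportion of hearing aid use time per intensity level (from the text)
TIME_PROPORTION = {"very light": 0.73, "light": 0.17, "moderate": 0.09, "vigorous": 0.01}

DAY_FRACTION = 0.416  # 10 h relative to 24 h, as used in the text


def daily_ee_kcal(median_met_by_level, bmr_kcal_per_day):
    """Estimated kcal over 10 h: sum over levels of MET * BMR * 0.416 * P_k."""
    return sum(
        median_met_by_level[level] * bmr_kcal_per_day * DAY_FRACTION * proportion
        for level, proportion in TIME_PROPORTION.items()
    )


# Hypothetical participant: median MET per intensity level and BMR in kcal/day
example_met = {"very light": 1.2, "light": 2.0, "moderate": 4.0, "vigorous": 7.0}
example_ee = daily_ee_kcal(example_met, 1500.0)
```

For these invented inputs, the time-weighted MET is 1.646 and the 10 h estimate is about 1,027 kcal, which is dominated by the very light level because of its 73% time share.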

Statistical analysis
The performance of the EE estimation models in the validation group was analyzed using Bland-Altman plots (39) comparing estimated and reference METs and calculating mean bias, 95% limits of agreement (LoA), and mean absolute errors (MAE). Analyses were performed for all activities except eating breakfast and postprandial resting measurements. The class-estimated model was used to quantify the Audéo performance and to compare it to Fitbit and Garmin. The class-known model was used to compare Audéo with ZurichMove sensors. Two-sided independent t-tests were used to test whether mean biases differed significantly from zero.
Bland-Altman analyses were also performed separately for individual activities, intensity classes, and age groups in the validation group. For the total daily EE, a Bland-Altman analysis was performed over the whole dataset. To assess whether Audéo was able to accurately detect a within-subject change in intensity, the change in MET between flat walking (activities 19 and 20) and flat running (activities 23 and 24) was compared with the reference method using regression analysis. Two-sided independent t-tests, Pearson Chi-Square, and Fisher's exact test were used to compare demographic and anthropometric data between the two age groups and between the calibration and validation groups. All analyses were performed using MATLAB R2023a. Significance was set as p < 0.05.
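The agreement metrics named above can be computed as in the following sketch. It assumes the conventional Bland-Altman definition of the 95% LoA as mean bias ± 1.96 times the sample standard deviation of the differences; the input values and names are illustrative.

```python
import statistics


def bland_altman(estimated, reference):
    """Mean bias, 95% limits of agreement, and mean absolute error (MAE)."""
    diffs = [e - r for e, r in zip(estimated, reference)]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)  # sample SD of the differences
    mae = statistics.mean([abs(d) for d in diffs])
    return {
        "bias": bias,
        "lower_loa": bias - half_width,
        "upper_loa": bias + half_width,
        "mae": mae,
    }


# Illustrative MET values only
result = bland_altman([2.0, 3.0, 5.0], [1.0, 3.0, 4.0])
```

In this convention, a negative bias means the device underestimates the reference on average, while the width of the LoA describes precision independently of the bias.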

Energy cost of physical activities
Figure 3 shows the reference EE, as measured by the ergospirometric device, in METs for all activities. The median METs for most activities fell within the expected range (i.e., light intensity), except for some household activities, such as vacuum cleaning, cleaning with a mop, dust wiping, and hanging laundry, which were categorized into the moderate intensity category.

EE estimation
Mean bias for the Audéo class-estimated model was −0.23 METs (see Figure 4), differing significantly from zero (p = 0.031). Lower and upper LoA were −3.56 and 3.10 METs, respectively. MAE amounted to 1.19 METs.
The performance metrics for Audéo (class-estimated) vs. Fitbit vs. Garmin are shown in Table 3 and Figure 5. Note that for this comparison, two participants had to be excluded from the analysis as their Fitbit data could not be downloaded. To facilitate the comparison to the 6 min data from Fitbit and Garmin, the three stages of activity 18 (Cycle_ergo) were averaged into a single value.
Performance metrics for Audéo vs. ZurichMove (both class-known) are shown in Table 4. Note that with the approach used in this study, it was not possible to build an EE estimation model for the wrist sensors because of the missing linear relationship between counts and METs in the wrist data for the majority of the activity classes.

Figure: Metabolic equivalent of task (MET) by activity measured by the reference method. Shown are individual means (dots) and group medians (lines) for each activity. Refer to Table 2 for a description of the activities. The dotted lines show the thresholds for sedentary behavior (MET < 1.5), light intensity (1.5 ≤ MET < 3.0), moderate intensity (3.0 ≤ MET < 6.0), and vigorous-intensity physical activity (MET ≥ 6.0). ADL, activities of daily living; 50/75, 50%/75% of max. walking or running speed; inc10, 10% inclination.

Average error over a day
The total daily caloric estimation during wake-time (10 h) using reference EE values was 1,139 kcal. Mean bias for Audéo was +61 kcal (see Figure 7), with lower and upper LoA being −241 kcal and +363 kcal. MAE amounted to 131 kcal. Accuracy and precision for Fitbit (mean bias −111 kcal and lower and upper LoA −435 and +213 kcal, respectively) were comparable to Audéo. Compared to Audéo and Fitbit, Garmin showed similar accuracy but lower precision (mean bias +136 kcal and lower and upper LoA −716 and +989 kcal, respectively).

Age subgroups
The EE estimation model performed slightly better in the older subgroup compared to the younger subgroup, as indicated by a lower mean bias (−0.13 METs in older vs. −0.33 METs in middle-aged participants), narrower LoA (−3.27 to +3.01 METs vs. −3.84 to +3.18 METs), and lower MAE (1.18 METs vs. 1.20 METs, p = 0.896).

Within-subject change in intensity
The within-subject changes in METs from walking flat to running flat for Audéo and the reference were positively correlated (R = 0.566, p < 0.001) (see Figure 8). The mean change in METs for Audéo amounted to 2.63 METs, as opposed to the reference, which had a mean change of 3.99 METs. Mean bias of change was −1.36 METs, with lower and upper LoA being −4.42 and 1.69 METs. MAE of change was 1.65 METs.

Discussion
This study aimed to predict EE in middle-aged and older participants in a broad spectrum of ADL (including activities with aids) using an accelerometer integrated into a hearing aid and comparing it at the same time to other research and consumer devices located on the wrist and hip. Bland-Altman analyses show good overall accuracy (low mean bias; Audéo vs. Reference: −0.23 METs) but low precision (wide LoAs; Audéo vs. Reference: ± 3.33 METs). Performance was slightly superior to wrist-worn consumer accelerometers and equivalent to a research accelerometer placed on the hip when the same modeling approach was used.

EE estimation
Across all activities, the Audéo prediction model (class-estimated) demonstrated a minor underestimation of 0.23 METs, along with a wide LoA of ± 3.33 METs. These findings align with other studies that used ear-level accelerometers to predict EE. For example, Atallah et al. (26) used an ear-worn inertial sensor during 11 ADLs (lying, standing, computer work, vacuuming, stairs, walking, running, cycling, and rowing) in 25 healthy young subjects. This resulted in an overall mean bias of 0.04 METs and a LoA of ± 3.65 METs. Similarly, Bouarfa et al. (27) developed an EE prediction model using 25 young subjects involving 10 ADLs (same as in Atallah et al., but without rowing). They reported a mean absolute deviation below 1.2 METs, i.e., identical to the MAE found in this study.
Good accuracy and low precision are typically also observed in other studies that use research or consumer devices to predict EE. For example, Crouter et al. (40) investigated the performance of the Actigraph and Actical devices (both worn on the waist), and the AMP-331 monitor (worn on the ankle) during 18 different leisure and sporting activities (e.g., lying, computer work, vacuuming, walking, running, stairs, basketball) to predict MET in 48 younger to middle-aged adults. The Bland-Altman plots show mean biases of about −0.5, −1.0, and −2.5 METs for Actigraph, Actical, and AMP-331, respectively, and LoAs of about ± 3.5, ± 3.0, and ± 4.0 METs. These data also align with our finding that sensors located closer to the body's center of mass (including the ear) tend to outperform devices positioned on the limbs when the goal is to predict EE in a wide range of ADL. Literature supports this observation, indicating that wrist-worn accelerometers generally yield less accurate EE estimates than those worn on the hip (16-18).
In theory, this difference could be attributable to different device grades, as research devices are typically worn on the hip or chest, while consumer devices are worn on the wrist. For example, Chowdhury et al. (41), comparing consumer monitors (Microsoft Band, Apple Watch, and Fitbit Charge HR) with a research device (Actiheart) during 9 ADL in 30 young subjects, concluded that consumer devices are not yet at the level of the best research devices: mean bias and 95% LoA values amounted to −0.55 ± 3.65 METs for consumer monitors and 0.51 ± 2.11 METs for Actiheart. However, this finding is more likely confounded by the sensor location: while the Actiheart was worn on the chest, all other sensors were worn on the wrist. Indeed, when consumer and research devices are worn at the same location, differences tend to disappear. For example, a meta-analysis investigating the accuracy of wrist-worn devices found no significant overall differences between research and consumer devices in estimating EE (6). This finding is consistent with the notion that the latest generation of consumer devices incorporates similar technology to that of established research devices (41). Our findings thus contribute to the existing body of literature by demonstrating that an accelerometer integrated into a hearing aid performs comparably to a research device placed on the hip, despite exhibiting a broad LoA, a characteristic consistent with other studies in the field.
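The accuracy and precision figures compared above (mean bias ± 95% LoA) can be illustrated with a minimal sketch. It assumes the conventional Bland-Altman definition, in which the LoA half-width is 1.96 times the standard deviation of the differences between device and reference; the function name and the MET values are hypothetical and are not data from this study.

```python
import math

def bland_altman(estimates, references):
    """Return (mean bias, half-width of the 95% limits of agreement)."""
    diffs = [e - r for e, r in zip(estimates, references)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 in the denominator)
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, 1.96 * sd

# Hypothetical MET values for five activities (illustration only)
device_mets = [1.2, 2.9, 3.4, 6.1, 4.0]
reference_mets = [1.0, 3.5, 3.0, 7.5, 4.2]

bias, loa = bland_altman(device_mets, reference_mets)
print(f"mean bias = {bias:.2f} METs, 95% LoA = ±{loa:.2f} METs")
```

A bias close to zero combined with a wide LoA corresponds to the pattern described above: good accuracy on average, but low precision for individual measurements.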
Our observation that the Audéo model marginally surpasses Fitbit and Garmin in performance should be interpreted with caution. We developed and validated our model with the assumption that 1 MET = V̇O2 measured individually at rest. Reanalyzing the data using 1 MET = 3.5 ml·min⁻¹·kg⁻¹, mean biases for Fitbit and Garmin change from about −0.6 METs to about −0.1 METs. Precision also improves but remains slightly inferior to Audéo. More importantly, the calibration and validation groups are very similar in this study. This may disadvantage Fitbit and Garmin, which were likely calibrated in populations less comparable to the one studied here. For example, EE estimation in an older population is compromised when a model is trained on younger subjects and improves substantially when the model is trained on older participants (unpublished data). This study shows that the EE estimation is slightly better in the older subgroup and that training the model on the older subgroup leads to better performance when evaluated in the older subgroup than when the model is trained on the middle-aged subgroup (data not shown). Due to age-related physiological and functional changes, e.g., changes in speed of movement, gait mechanics, and body composition (42-44), algorithms validated in younger adults may not accurately apply to older age groups (18). Whether Audéo outperforms Garmin and Fitbit (or other consumer sensors) in an independent sample in a laboratory setting or under free-living conditions needs to be tested according to existing validation frameworks [e.g., (30)].
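The effect of the MET definition on the reference values can be sketched as follows. The two functions and all V̇O2 values are hypothetical; the sketch only shows that, for a participant whose resting V̇O2 lies below 3.5 ml·min⁻¹·kg⁻¹, the individually referenced MET value is higher than the conventionally referenced one, so a device whose estimates are scaled to the conventional value would appear to underestimate. How the consumer devices define 1 MET internally is not known here and is treated as an assumption.

```python
STANDARD_REST_VO2 = 3.5  # ml/min/kg, conventional definition of 1 MET

def mets_individual(vo2_activity, vo2_rest):
    """METs relative to the participant's own measured resting VO2."""
    return vo2_activity / vo2_rest

def mets_standard(vo2_activity):
    """METs relative to the conventional 3.5 ml/min/kg."""
    return vo2_activity / STANDARD_REST_VO2

# Hypothetical participant: resting VO2 below the conventional value
vo2_rest = 2.8   # ml/min/kg
vo2_walk = 11.2  # ml/min/kg during walking

individual = mets_individual(vo2_walk, vo2_rest)
standard = mets_standard(vo2_walk)
print(f"individual reference: {individual:.1f} METs")
print(f"standard reference:   {standard:.1f} METs")
```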

Precision of EE estimates
The broad LoA observed in this study for Audéo (and other sensors) can be attributed to the underestimation and overestimation of EE for distinct activities. Specifically, EE associated with ascending activities (e.g., climbing stairs, walking uphill) and stationary activities (e.g., cycling on an ergometer, squatting) was commonly underestimated, whereas EE for descending activities was overestimated, as illustrated in Figure 6 and Supplementary Figure S1. This phenomenon is likely attributable to the disparity between accelerometer counts and actual EE for certain activities, namely, those involving low accelerations with high EE (e.g., cycling) and those with high accelerations but low intensity (e.g., descending stairs). Variability in accuracy for single activities has also been demonstrated in reviews on the topic (6,45) and is a known limitation when using accelerometers to predict EE. When comparing different Actigraph equations, Crouter et al. (40), for example, concluded that no single equation is valid for the EE estimation of all activities and that equations work best only in the activity subgroup for which they were developed. In this study, mean bias and LoA increased with activity intensity, a trend likely attributable to the fact that intense activities were predominantly ascending and stationary activities, which are associated with the largest errors.
In the model employed in this study, we operated under the assumption of a linear relationship between counts and METs within the different activity classes. Although this two-step approach (initial classification followed by application of a class-specific model) is acknowledged to enhance estimation performance (46), it is not without its challenges. For example, for some activity classes, there were no or only weak correlations between counts and METs, rendering EE prediction challenging. Similarly, in a meta-analysis on the validity of the Actigraph device (worn either on the hip or wrist) for measuring EE in healthy adults, Wu et al. (45) found no correlation between activity counts and EE for some activities, e.g., during cycling, standing, walking at a moderate speed, and fast running (47). Because of the missing linear relationship between counts and METs in the wrist data within most of the activity classes in this study, we refrained from developing an EE-estimation model for ZurichMove sensors on the wrist. It is important to note that our methodology was specifically tailored to enhance the EE prediction for a sensor integrated into a hearing aid with limited memory and computational power, suggesting that different strategies might have been effective for wrist-worn sensors or devices with more computational power.
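The two-step approach described above can be sketched in a few lines. This is an illustration under stated assumptions, not the model used in this study: the activity classes, the threshold classifier, and the slope and intercept values are all hypothetical. Passing a known class mimics a class-known model, whereas omitting it mimics a class-estimated model.

```python
# Hypothetical (slope, intercept) of a linear counts-to-METs model per class
CLASS_MODELS = {
    "sedentary": (0.0002, 1.1),
    "walking": (0.0008, 2.0),
    "running": (0.0010, 4.5),
}

def classify(counts):
    """Toy threshold classifier standing in for a real activity classifier."""
    if counts < 500:
        return "sedentary"
    if counts < 4000:
        return "walking"
    return "running"

def estimate_mets(counts, known_class=None):
    """Step 1: pick the activity class; step 2: apply its linear model."""
    activity = known_class if known_class is not None else classify(counts)
    slope, intercept = CLASS_MODELS[activity]
    return slope * counts + intercept

class_estimated = estimate_mets(2500)                    # class from classifier
class_known = estimate_mets(2500, known_class="running")  # class supplied
print(class_estimated, class_known)
```

The sketch also makes the stated limitation visible: if counts and METs are uncorrelated within a class, the slope carries no information and the estimate collapses to the class intercept.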

EE estimation over a day
Despite this variability in accuracy for single activities, the errors might cancel out under the assumption that a wide range of activities are performed over a day. Indeed, the calculated mean bias for Audéo for a 10 h wear time amounted to an overestimation of 61 kcal (∼5% of the 10-h EE). Similarly, Härtel et al. (48), investigating the kmsMove sensor (worn on the hip) during rehabilitation activities over 7 h in 7 middle-aged adults, found an average underestimation of 14 kcal. Berntsen et al. (49), using Actigraph (worn on the hip), (chest and thigh), and ikcal (chest) monitors during free-living lifestyle and working activities over 2 h in 20 younger and middle-aged adults, found mean biases ranging from −34 to −111 kcal. Despite good overall accuracy, drawing inferences for individual subjects remains challenging, as evidenced by our data (LoA 302 kcal) and Härtel's and Berntsen's findings (LoA ranging from 261 to 397 kcal). Nonetheless, it can be argued that the level of accuracy and precision shown by Audéo and other devices is acceptable in the context of health interventions. For example, in order to achieve weight loss, energy intake should be reduced by about 500-1,000 kcal a day (50). This change is higher than the reported LoA of this study, hinting at the potential for detecting such changes. Furthermore, the observed significant within-subject correlation between an increase in intensity from walking to running, as measured by both the ergospirometric device and the Audéo sensor, underscores the device's ability to detect MET changes. However, this detection is relative rather than absolute, as indicated by the regression line's deviation from the identity line. This deviation from the identity line is the result of an overestimation of METs during level walking activities.
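The cancellation of activity-specific errors over a day can be illustrated with a short sketch. All activities, durations, and kcal/h rates below are hypothetical and chosen only to mirror the direction of the errors discussed above (overestimation during level walking, underestimation during ascending and stationary activities); they are not results of this study.

```python
# (activity, duration in h, estimated kcal/h, reference kcal/h), all hypothetical
day = [
    ("sitting", 6.0, 80.0, 75.0),
    ("level walking", 2.0, 260.0, 230.0),
    ("stair climbing", 0.5, 380.0, 520.0),
    ("cycling", 1.5, 300.0, 330.0),
]

def daily_errors(activities):
    """Return (signed error, summed absolute error, reference total) in kcal."""
    signed = sum(h * (est - ref) for _, h, est, ref in activities)
    absolute = sum(h * abs(est - ref) for _, h, est, ref in activities)
    reference_total = sum(h * ref for _, h, _, ref in activities)
    return signed, absolute, reference_total

signed, absolute, reference_total = daily_errors(day)
print(f"signed error over the day: {signed:+.0f} kcal")
print(f"sum of absolute errors:    {absolute:.0f} kcal")
print(f"relative signed error:     {100 * signed / reference_total:+.1f}%")
```

In this toy day the activity-level errors add up to 205 kcal in absolute terms, yet the signed daily error is only −25 kcal, because over- and underestimations partly offset each other.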

Future directions
Our findings from a sensor located at ear level open the door for other application fields in this age range, e.g., IMU integration in in-ear headphones worn during sporting activities (29). Accuracy and precision for the Audéo sensor, and in general for accelerometers aiming to predict EE in a variety of activities (including ADLs and sporting activities), can be improved by the incorporation of heart rate (HR) data, as evidenced by O'Driscoll et al.'s (6) meta-analysis. This is particularly relevant for activities exhibiting a disparity between accelerometer counts and actual EE. It would be interesting to explore whether an ear-worn device that can detect HR via photoplethysmography can provide better performance. In addition, incorporating an altimeter has the potential to improve accuracy given that the largest errors in this study were found for ascending and descending activities. This is supported by Duncan et al. (51), who found that using barometers and global positioning systems improved EE estimation accuracy during field-based activities, compared to accelerometry alone, by 11%.

Limitations
This study was performed in a laboratory setting, following the framework proposed by Keadle and colleagues (30). However, it is known that the accuracy of models validated under laboratory conditions decreases when applied in free-living conditions (21). Also, the modeling approach adopted in this study did not enable the development of an EE estimation model for the ZurichMove sensors positioned on the wrists, thereby precluding a comparative analysis of their performance with the Audéo sensor.

Conclusion
This study demonstrates that an accelerometer integrated into a hearing aid (Audéo) can accurately estimate EE across a broad range of ADL in a middle-aged to older population. However, the precision of these estimates is limited, making personal-level inferences challenging, though still offering valuable insights on a population level. Moreover, the Audéo sensor's performance in EE prediction, using the same modeling approach, matched that of a research device worn on the hip and slightly outperformed two wrist-worn consumer monitors. This indicates that an accelerometer integrated into a hearing aid can serve as an equivalent alternative for monitoring physical activity. This opens the door to unobtrusive evaluation of energy expenditure during daily life in older individuals who are already using hearing aids, and eventually to implementation of personalized interventions promoting healthier aging.

FIGURE 1 Overview of the experimental visit (visit 2). See text for details. EE, energy expenditure measurement.

Figure 6 shows the mean bias and LoA of the Audéo EE estimation for the different activity classes mentioned in

FIGURE 4 Bland-Altman plot for Audéo (class-estimated). The solid red line shows the mean bias, while the dotted lines indicate the upper and lower 95% limits of agreement. Colored dots show different activity classes. ADL, activities of daily living; MET, metabolic equivalent of task.

FIGURE 5 Bland-Altman plots for Audéo (class-estimated), Fitbit and Garmin. The solid red line shows the mean bias, while the dotted lines indicate the upper and lower 95% limits of agreement. Colored dots show different activity classes. ADL, activities of daily living; MET, metabolic equivalent of task.

FIGURE 6 Mean bias and limits of agreement (LoA) by activity class for Audéo. ADL, activities of daily living; MET, metabolic equivalent of task; N, number of subjects (validation group).

FIGURE 7 Daily caloric estimation: comparison between Audéo and reference. Empty circles show individual data, filled circles show mean values. Energy expenditure was summed over activities of very low intensity (vLow), low intensity (Low), moderate intensity (Mod), and vigorous intensity (Vig). EE, energy expenditure.

FIGURE 8 Within-subject correlation between the change in metabolic equivalent of task (MET) for Audéo and reference between walking and running. The dashed black line shows the identity line, while the dashed red line is the regression line.

TABLE 2 Overview of physical activities.
Metamax 3B, Cortex, Leipzig, Germany [7 participants]) served as a basis for the calculation of reference EE. It recorded, on a breath-by-breath basis, parameters related to ventilation and gas exchange (O2 uptake, V̇O2; CO2 output, V̇CO2
MET, metabolic equivalent of task; LoA, limit of agreement; MAE, mean absolute error; p-value, two-sided independent t-test (mean bias different from zero).

TABLE 5 Performance metrics by intensity level for the Audéo class-estimated model.