Measuring Arterial Oxygen Saturation Using Wearable Devices Under Varying Conditions

INTRODUCTION: Recently developed wearable monitoring devices can provide arterial oxygen saturation (SpO2) measurements, offering potential for use in aerospace operations. Pilots and passengers are already using these technologies, but their performance has not yet been established under conditions experienced in the flight environment, such as environmental hypoxia and concurrent body motion. METHODS: An initial evaluation was conducted in 10 healthy subjects who were studied in a normobaric chamber during normoxia and at a simulated altitude of 15,000 ft (4572 m; 11.8% oxygen). SpO2 was measured simultaneously using a standard pulse oximeter and four wearable devices: Apple Watch Series 6; Garmin Fēnix 6 watch; Cosinuss o Two in-ear sensor; and Oxitone 1000M wrist-worn pulse oximeter. Measurements were made while stationary at rest, during very slight body motion (induced by very low intensity cycling at 30 W on an ergometer), and during moderate body motion (induced by moderate intensity cycling at 150 W). RESULTS: Missed readings, defined as failure to record an SpO2 value within 1 min, occurred commonly with all wearables. Even with only very slight body motion, most devices missed most readings (range of 12–82% missed readings) and the rate was higher with greater body motion (range 18–92%). One device tended to under-report SpO2, while the other devices tended to over-report SpO2. Performance decreased across the devices when oxygenation was reduced. DISCUSSION: In this preliminary evaluation, the wearable devices studied did not perform to the same standard as a traditional pulse oximeter. These limitations may restrict their utility in flight and require further investigation.

has been particularly topical in the setting of military fast-jet operations due to the possible contribution of hypoxia to unexplained physiological events. However, it is important to establish the performance of new technologies prior to safety-critical use. With regard to isolated SpO2 monitoring during flight, additional care is required as interpretation can be challenging or misleading even for accurate measurements; for example, in the presence of hyperventilation. 2 While the accuracy of heart rate data from wearables has been well-reported, the ability to measure SpO2 is a newer feature and has not been comprehensively investigated. 8 Standard pulse oximeters used in medical practice utilize transmissive photoplethysmography (PPG), in which a light source and photodetector are located on opposite sides of a vascular bed (such as a finger or ear lobe) and the intensity of transmitted light of certain wavelengths is measured. The reliability of this technique is well established, but such devices tend to be somewhat obtrusive when used while performing other activities. In contrast, wearables are by their nature less obtrusive, but typically use the less established technique of reflective PPG, in which the light source and photodetector are positioned on the same side of a vascular bed and the intensity of reflected light is measured. 12 Wearables are also designed for SpO2 measurements to be made while completely stationary.
Recently developed wearables that can measure SpO2 include consumer-grade products such as the Apple Watch 6 (Apple Inc., Cupertino, CA, USA) and Garmin Fēnix 6 watch (Garmin Ltd, Olathe, KS, USA), which are marketed for 'general fitness and wellness purposes' rather than for medical use. In contrast, the commercially available in-ear ('hearable') Cosinuss o Two (Cosinuss GmbH, Munich, Germany) has undergone testing in clinical settings, although comparative data has not been published and it is not currently classified as a medical device, while the wrist-worn Oxitone 1000M (Oxitone Medical, Kfar Saba, Israel) is an FDA-cleared medical monitor intended for clinical use. The Garmin and Apple watches and Cosinuss o Two use reflective PPG, while the Oxitone 1000M uses transmissive PPG. There is little published research reporting SpO2 data from these devices. The Oxitone 1000M has been reported to provide accurate and precise SpO2 values when measured in a stationary state, 5 while a recent study conducted in a respiratory outpatient clinic reported that the Apple Watch 6 appeared to be a reliable means of measuring SpO2 in this controlled setting, although there were occasional outlying values. 10 An earlier Garmin watch model (the Fēnix 5× Plus) was found to over-estimate SpO2 in volunteers studied in a normobaric chamber, especially at higher simulated altitudes, and it was noted that achieving a single measurement could take up to 3 min. 6 This highlights the potential for measurement failure to impact on performance: irrespective of its other qualities, a device that is unable to reliably achieve a timely reading is unlikely to be useful in the flight environment.
Although there is limited data and satisfactory performance cannot be assumed across the various technologies, these initial studies are generally encouraging with regard to use while stationary and under normoxic conditions. However, in-flight use does not necessarily allow such optimal conditions; achieving an absolutely motionless state can be challenging or impossible, and a lower range of SpO2 may well be encountered. To our knowledge, no previous studies have investigated the potential combined effects of hypoxia and concurrent body motion of any degree. This initial study aimed to undertake a preliminary evaluation of four leading wearable devices in measuring SpO2 under normoxic and hypoxic conditions while at rest and during relevant levels of body motion, including very minimal movement only marginally beyond a stationary state. The hypothesis was that their performance in measuring SpO2 would be the same as that of a standard pulse oximeter. Our aim was to generate preliminary results and provide a basis for the definitive studies that are ultimately required.

Subjects
This study was conducted in healthy volunteers and was approved by the King's College London Research Ethics Committee. It was conducted in accordance with the Declaration of Helsinki. All subjects provided written informed consent.

Equipment
The study was undertaken in a normobaric altitude chamber (Sporting Edge, Basingstoke, UK) containing a cycle ergometer (Monark 818E, Monark Exercise, Vansbro, Sweden). Reference SpO2 was measured continuously at the left index finger using a standard pulse oximeter (Pulse Oximeter 7840, Kontron Instruments Ltd, West Sussex, UK) recorded via PowerLab 8/35 and LabChart 8.0 (AD Instruments, Oxford, UK) and was compared with data from an Apple Watch 6 (at the left wrist), Garmin Fēnix 6 watch, and Oxitone 1000M (at the right wrist) and a Cosinuss o Two (in the right ear). All wearables were attached and operated according to the manufacturer's instructions, and the Cosinuss o Two was fitted for size (small, medium, or large). Simultaneous heart rate measurements were recorded from all monitors in parallel with SpO2.

Procedure
Subjects attended the laboratory on 2 experimental days separated by a minimum of 24 h. The protocol was identical on each occasion except one day was conducted under normoxic conditions in room air (20.9% oxygen) and the other was conducted in hypoxic conditions at a simulated altitude of 15,000 ft (4572 m; 11.8% oxygen). This altitude was intended to extend nadir SpO2 values into the 70–80% range. The order of normoxia and hypoxia was counterbalanced and subjects were blinded to each condition. Following instrumentation, subjects entered the hypoxia chamber and completed 10 min of seated rest. They then cycled on the ergometer for 5-min periods at very low intensity (30 W) and at moderate intensity (150 W) separated by 5 min of seated rest. These periods of cycling were intended as a reproducible means of inducing very slight body motion (30 W) and moderate body motion (150 W), with the added potential for exaggerating any hypoxemia. 13 Participants were instructed to remain otherwise still while cycling and there was minimal associated motion of the arms and head, especially at 30 W, which requires only very gentle pedaling. A further 5 min of seated rest concluded testing. For each period of rest and cycling, measurements of SpO2 and heart rate were recorded at three evenly spaced time points. A maximum of 1 min was allowed to obtain a reading from each device, after which a failed or 'missed' measurement was recorded.
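The correspondence between the simulated altitude and the chamber oxygen fraction can be checked with a short calculation. The sketch below is not taken from the study protocol: it assumes the International Standard Atmosphere pressure model and a simple equivalence of dry inspired oxygen partial pressure (water vapor is ignored), and the function names are hypothetical.

```python
SEA_LEVEL_PRESSURE_PA = 101_325.0  # ISA sea-level pressure
SEA_LEVEL_O2_PERCENT = 20.9        # room-air oxygen fraction quoted in the protocol
FT_TO_M = 0.3048


def isa_pressure_pa(altitude_m: float) -> float:
    """ISA tropospheric pressure; an assumed model, valid below about 11 km."""
    return SEA_LEVEL_PRESSURE_PA * (1.0 - 2.25577e-5 * altitude_m) ** 5.25588


def equivalent_o2_percent(altitude_ft: float) -> float:
    """Sea-level oxygen fraction giving the same dry oxygen partial pressure
    as room air at the stated altitude (humidification ignored)."""
    ratio = isa_pressure_pa(altitude_ft * FT_TO_M) / SEA_LEVEL_PRESSURE_PA
    return SEA_LEVEL_O2_PERCENT * ratio


print(f"{equivalent_o2_percent(15_000):.1f}% oxygen")
```

Under these assumptions the result rounds to the 11.8% oxygen stated for 15,000 ft (4572 m).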

Statistical Analysis
Data were normally distributed (Shapiro-Wilk test). The effect of hypoxia on SpO2 and heart rate was analyzed with paired t-tests (SPSS Statistics v.26, IBM, Armonk, NY, USA) using mean data for each period of rest or cycling (using SpO2 and heart rate data obtained from the reference pulse oximeter). The accuracy and bias of measurements from the wearable devices were tested against the reference pulse oximeter using paired t-tests, Bland-Altman analyses (GraphPad Prism, v.26, San Diego, CA, USA), and mean absolute percentage error (MAPE) score. MAPE was calculated using the following equation: ((actual value − forecast value)/actual value) × 100. Statistical significance was assumed at P < 0.05 and data are presented as mean ± SD.
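As an illustration of the agreement measures named above, the following minimal sketch computes MAPE and Bland-Altman bias with 95% limits of agreement. It is not the authors' analysis code. It assumes that the absolute value of each percentage error is taken before averaging (as the name MAPE implies), that differences are expressed as device minus reference, and that the limits are bias ± 1.96 SD of the differences. The readings are hypothetical.

```python
from statistics import mean, stdev


def mape(reference, device):
    """Mean absolute percentage error of device readings against the reference."""
    return mean(abs((r - d) / r) * 100 for r, d in zip(reference, device))


def bland_altman(reference, device):
    """Return (bias, lower limit, upper limit) for device minus reference."""
    diffs = [d - r for r, d in zip(reference, device)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width


# Hypothetical paired SpO2 readings (%), not data from this study.
reference = [97.0, 96.0, 95.0, 94.0, 93.0]
device = [98.0, 97.0, 95.0, 96.0, 94.0]

bias, lower, upper = bland_altman(reference, device)
print(f"MAPE = {mape(reference, device):.2f}%")
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
```

A positive bias in this convention corresponds to a device that over-reports relative to the reference pulse oximeter.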

DISCUSSION
This preliminary study of four wearable devices indicates that, across a range of SpO2 values and levels of body motion, the ability of each of the respective devices to measure SpO2 diverged substantially from that of a traditional pulse oximeter. A high proportion of readings were recorded as 'missed' when the device failed to provide a measurement within 1 min, which would be considered a potentially critical operational failure in many aviation contexts. Missed measurements were common even at rest for most devices and none were able to reliably provide SpO2 measurements during cycling at moderate or even low intensity, when associated movement of the rest of the body was very minimal. The Apple Watch 6 had the highest accuracy with a potentially acceptable bias when SpO2 values were achieved, but the device missed the majority of readings in the presence of very slight body motion, and missed nearly all readings when body motion was at a moderate level. These wearable devices are designed for SpO2 measurements to be taken in a stationary state, but this is likely to be difficult or impossible to achieve during flight operations. Measurements were frequently missed even when there was only the slightest body motion and it is, therefore, questionable whether these devices would be able to obtain measurements reliably in many real-world settings, including aerospace environments.
The reduction in the performance of wearables in the presence of any movement of the body is attributable to motion artifact. As technology advances and devices become progressively miniaturized, the PPG signal is more readily exposed to noise such as motion artifact and to movement of the PPG sensor that alters the direction in which the light signal is emitted. This is particularly pertinent when the motion artifact frequency corresponds with that of the PPG signal (0.5–5.0 Hz). Typically, motion artifact noise occupies a frequency range of 0.01–10 Hz, thus regularly overlapping with the PPG band. 7 A further factor to be considered is the potential for variation in peripheral circulation to affect SpO2 measurements. Poor perfusion can cause a decrease in the ratio of arterial to venous blood at the sensor location, reduced venous saturation through a larger oxygen extraction ratio, and lower pulse amplitude. In addition, motion artifact can have a more profound impact when pulse amplitude is suppressed as it exerts a greater influence on the PPG signal. 9 Poor perfusion could conceivably have lowered the SpO2 readings of the wrist-worn wearables in this study if a redistribution of blood flow to the exercising muscles in the lower limbs occurred. However, this seems unlikely as any such effect would also have applied to the reference pulse oximeter, and we note that the Cosinuss o Two (situated in the ear) was the only device to consistently under-report SpO2.
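The frequency overlap noted in the paragraph above can be made concrete with simple interval arithmetic. The sketch uses only the two frequency ranges quoted in the text. The helper names and the 60 rpm pedaling cadence are illustrative assumptions, not measurements from this study.

```python
PPG_BAND_HZ = (0.5, 5.0)      # PPG signal band quoted in the text
MOTION_BAND_HZ = (0.01, 10.0)  # typical motion artifact band quoted in the text


def band_overlap(a, b):
    """Return the overlapping (low, high) interval of two bands, or None."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low < high else None


def in_band(freq_hz, band):
    """True if a frequency lies inside the closed band."""
    return band[0] <= freq_hz <= band[1]


overlap = band_overlap(PPG_BAND_HZ, MOTION_BAND_HZ)
ppg_width = PPG_BAND_HZ[1] - PPG_BAND_HZ[0]
overlap_fraction = (overlap[1] - overlap[0]) / ppg_width

cadence_hz = 60 / 60.0  # hypothetical 60 rpm pedaling cadence, i.e. 1 Hz

print(f"overlap: {overlap[0]}-{overlap[1]} Hz "
      f"({overlap_fraction:.0%} of the PPG band)")
print("cadence inside PPG band:", in_band(cadence_hz, PPG_BAND_HZ))
```

With these ranges the PPG band lies entirely inside the motion artifact band, which suggests that a fixed band-pass filter alone would not be expected to separate the two signals.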
The performance of wearables in measuring SpO2 has only been investigated in a small number of studies in which data was obtained at rest. 5,6,10 A perfectly motionless state provides optimal conditions and may explain the more favorable comparative data obtained with the Apple Watch 6, 10 Oxitone 1000M, 5 and the predecessor Garmin Fēnix 5× Plus watch. 6 The latter study also explored the effect of reducing inspired oxygen concentration and demonstrated a larger bias at a simulated altitude of 12,000 ft (3658 m) compared with lower altitudes. 6 In the current study we observed a decrease in the performance of SpO2 measurements under hypoxic conditions compared with during normoxia in all four wearable devices. Pulse oximeter performance is known to be reduced at lower SpO2 values 11 and, in this context, the possibility that wearables may be additionally unreliable when oxygenation is lower, such as at altitude, warrants particular caution regarding their use in aerospace operations.
This study had several limitations. The sample size was intended to allow an initial preliminary evaluation of multiple wearables across varying conditions. The results are preliminary in nature and are intended to serve as the basis for more definitive research. Subjects were young and healthy and were primarily from a white ethnic background, precluding any analysis of the effect of skin pigmentation. 3 Cycling does not replicate actual in-flight conditions and was used as a reproducible surrogate for relevant levels of body motion, as this is the aspect of pedaling that has the potential to impair readings from wearable devices. The protocol did not target associated metabolic activity, which is not directly related to the function of wearable monitors. It should be noted that hardware and software for these technologies remain under continuing development and improvement. Furthermore, consumer-grade products such as the Apple Watch 6 and Garmin Fēnix 6 carry disclaimers that SpO2 readings are not intended for medical use, and associated product information acknowledges that various factors may affect measurements, including a user's individual anatomy, the fit of the device, and ambient light conditions.
Wearable technology is rapidly advancing and, with further development, the ability to measure SpO2 unobtrusively offers great potential to be useful in a multitude of settings, including as a means of early detection of hypoxemia in clinical populations. This could encompass ambulatory and outpatient settings as well as ward-based, perioperative, and critical care medicine. Ultimately, wearable-derived SpO2 data may likewise offer benefits as in-flight tools, whether for pilots, passengers, aeromedical patients, rear crew, or skydivers. Based on this preliminary study, we suggest further research and development is required before this can be generally recommended. Future investigations may consider ways to minimize movement-associated noise infiltrating reflective PPG signals and should encompass relevant populations and environmental conditions, including actual in-flight measurements.
In summary, while wearable devices offer great promise, in this preliminary study the four wearable devices investigated did not perform to the same standard as a traditional pulse oximeter for SpO2 measurements. Limitations associated with varying conditions, including minimal body motion, may well apply in real-world settings, including aviation and spaceflight, and further research into the use of wearables in these domains is required.

Fig. 1. Mean arterial oxygen saturation and heart rate at rest and cycling at 30 W and 150 W under normoxic (20.9% oxygen) and hypoxic (11.8% oxygen) conditions. Solid red lines and circles denote normoxia. Dashed blue lines and squares denote hypoxia. Asterisks denote a statistically significant effect of hypoxia (P < 0.05). Data are mean ± SD.

Fig. 2 shows all recorded SpO2 data (at rest and while cycling) for each of the respective devices during normoxia and hypoxia. Under normoxic conditions, when values were successfully obtained, the SpO2 data from the Apple Watch 6 [t(4) = 0.5898, P = 0.6] and Oxitone 1000M [t(4) = 1.215, P = 0.3] were not significantly different from reference data obtained from the traditional pulse oximeter. However, SpO2 readings from the Garmin Fēnix 6 [t(4) = 4.867, P = 0.008] and Cosinuss o Two [t(4) = 3.964, P = 0.017] were significantly different from the corresponding reference data. During hypoxia, the Cosinuss o Two [t(4) = 0.3653, P = 0.7] was the only device to provide SpO2 measurements that were not significantly different from the reference data; the Apple Watch 6 [t(4) = 8.025, P = 0.001], Garmin Fēnix 6 [t(4) = 4.094, P = 0.015], and Oxitone 1000M [t(4) = 3.812, P = 0.019] data were significantly different from the reference data. Equivalent data for heart rate is shown in the supplementary online appendix (Fig. A1, found with the online version of this article or at https://doi.org/10.3357/AMHP.6078sd.2023).
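The degrees of freedom reported above, t(4), are consistent with five paired values per test. That would match the five protocol periods (three rest and two cycling) if period means were compared, although this reading is an inference rather than something stated in the text. The sketch below shows how a paired t statistic is formed from such data; the values are hypothetical and the function name is an assumption.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed critical value of Student's t for 4 degrees of freedom, alpha = 0.05.
T_CRITICAL_DF4 = 2.776


def paired_t(reference, device):
    """Return (t statistic, degrees of freedom) for device minus reference."""
    diffs = [d - r for r, d in zip(reference, device)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical period means (%), one per protocol period; not study data.
reference = [97.0, 96.0, 95.0, 94.0, 93.0]
device = [98.0, 97.0, 95.0, 96.0, 94.0]

t_stat, df = paired_t(reference, device)
print(f"t({df}) = {t_stat:.3f}, significant: {abs(t_stat) > T_CRITICAL_DF4}")
```

With only five pairs the test has limited power, so a non-significant result for a device does not by itself establish agreement with the reference.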

Fig. 2. Arterial oxygen saturation measured by the reference pulse oximeter and wearable devices during normoxia (red boxes) and hypoxia (blue boxes).Data are from all conditions combined (rest and cycling).The mean, interquartile range (boxes) and maximum and minimum values (bars) are shown.Asterisks denote a statistically significant difference (P < 0.05) between reference data obtained from the traditional pulse oximeter and data from the respective wearable devices.

Table I. SpO2 Measurements: Number of Data Points, Percentage of Missed Readings, Mean Absolute Percentage Error (MAPE) and Percentage Accuracy for Each Device Measuring SpO2 at Rest and During Cycling at 30 W and 150 W.
At rest, missed readings ranged between 2.5% and 20%, while during very low intensity cycling at 30 W, when associated body motion was very minimal, most devices missed most readings (range 12–82%). During moderate intensity cycling at 150 W, the percentage of missed readings ranged between 18% and 95%. Overall, the percentage of missed readings was lowest for the Cosinuss o Two and highest for the Oxitone 1000M. MAPE and percentage accuracy were calculated and are shown in Table I. With increasing cycling intensity, MAPE increased and percentage accuracy decreased. The Apple Watch 6 displayed the highest percentage accuracy independent of motion status, while the Garmin Fēnix 6 showed the lowest percentage accuracy. Equivalent data for heart rate is shown in Table II. Missed heart rate readings were generally less frequent, while overall, from rest to 150-W cycling, MAPE increased and percentage accuracy decreased.
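The missed-reading percentages above follow from the 1-min limit defined in the Procedure. A minimal sketch of that bookkeeping is given here. The data layout (time to a reading in seconds, or None when no value appeared) and the function names are assumptions for illustration, and the attempt times are hypothetical.

```python
TIMEOUT_S = 60.0  # maximum time allowed to obtain a reading (1 min)


def is_missed(time_to_reading_s):
    """A reading is missed if no value appeared or it took longer than 1 min."""
    return time_to_reading_s is None or time_to_reading_s > TIMEOUT_S


def missed_percentage(times_s):
    """Percentage of measurement attempts classed as missed."""
    return 100.0 * sum(is_missed(t) for t in times_s) / len(times_s)


# Hypothetical attempts for one device in one condition; not study data.
attempts = [12.0, None, 45.0, 75.0, 30.0, None, 20.0, 58.0]

print(f"missed readings: {missed_percentage(attempts):.1f}%")
```

Accuracy measures such as MAPE can only be computed on the attempts that were not missed, so the two quantities are best read together.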

Table II. Heart Rate Measurements: Number of Data Points, Percentage of Missed Readings, Mean Absolute Percentage Error (MAPE) and Percentage Accuracy for Each Device Measuring Heart Rate at Rest and During Cycling at 30 W and 150 W.