Monte Carlo analysis of optical heart rate sensors in commercial wearables: the effect of skin tone and obesity on the photoplethysmography (PPG) signal

Commercially available wearable devices have been used for fitness and health management, and demand for them has increased over the last ten years. These “general wellness” and heart-rate monitoring devices have been cleared by the Food and Drug Administration for over-the-counter use, yet anecdotal and more systematic reports seem to indicate that their error is higher when used by individuals with darker skin tone and high body mass index (BMI). In this work, we used Monte Carlo modeling of a photoplethysmography (PPG) signal to study the theoretical limits of three different wearable devices (Apple Watch series 5, Fitbit Versa 2 and Polar M600) when used by individuals with a BMI range of 20 to 45 and a Fitzpatrick skin type of 1 to 6. Our work shows that increased BMI and skin tone can induce a relative loss of signal of up to 61.2% in Fitbit Versa 2, 32% in Apple S5 and 32.9% in Polar M600 when considering the closest source-detector pair configuration in these devices. © 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement


Introduction
Commercial wearable devices such as fitness trackers, smart health watches, and wearable blood pressure monitors help users generate data on their personal health. Many of these wearables monitor heart rate using photoplethysmography (PPG). PPG is a low-cost and non-invasive technique that detects blood volume changes in living tissue by measuring the amount of light absorbed by blood in the microvascular bed [1,2]. Wearable devices that continuously monitor heart rate can potentially reduce the complications of cardiovascular disease through early detection of abnormalities in heart rhythm [3]. Thus, such technologies can revolutionize the field of preventative medicine while improving care for vulnerable communities [4]. The global wearable technology market is expected to reach a value of 104.39 billion USD by 2027 [5]. In addition, the Food and Drug Administration (FDA) clearance of smartphone applications [6] can further increase consumers' use of these devices for monitoring health. While the demand for wearable technologies is booming, it is important to evaluate their accuracy in monitoring health.
Both research and anecdotal evidence indicate that the accuracy of wearables may be compromised in individuals with higher skin melanin concentration while conducting various physical activities [7-12]. Notably, studies concluded that inaccurate heart rate measurements may occur in people with darker skin as well as higher levels of obesity [13-17]. In contrast to these findings, recent experimental studies seem to indicate that although activities may lead to inaccuracy in heart rate measurements using commercial wearable devices, skin tone is not a source of error [9,18].
To improve our understanding of the effects of skin tone and levels of obesity on the accuracy of heart rate measurements with commercial wearable devices and to inform the recent debates on such topics [7,9,18,19], we performed a numerical evaluation of PPG signals on skin with various melanin concentrations and levels of obesity. In this study, we incorporate both vascular networks within the multi-layered skin anatomy and the design parameters (such as wavelengths and source-detector distance) of the commercial PPG sensors in Apple Watch series 5, Fitbit Versa 2 and Polar M600 into Monte Carlo simulations.
More specifically, we simulate a skin geometry with dual vascular networks within the dermis to study the volumetric change in blood while considering various skin tones and levels of obesity. Subsequently, the pulsatile contribution of the vessels and the PPG waveforms from the wrist region were extracted.

Materials and methods
The specifications of the PPG sensors, including the geometry of the sources and detectors and the illumination wavelengths of the sources, were obtained by reverse engineering the devices under evaluation (Apple Watch series 5, Fitbit Versa 2 and Polar M600). Both the size of the components and the physical distances between them were measured with calipers and digitally with ImageJ software. The illumination wavelengths were measured with a spectrophotometer (CCS175, Thorlabs Inc., Newton, NJ; STS-NIR, Ocean Insight, Orlando, FL). This information was integrated into a Monte Carlo (MC) model that simulated the PPG signal.

PPG sensor specifications
Specifications of the PPG sensors in the devices are provided in Table 1. The schematics of the sensor architecture in Apple S5, Fitbit Versa 2 and Polar M600 are given in Fig. 1, Fig. 2 and Fig. 3, respectively. All distances are in millimeters (mm). [20,21]. The MC algorithm used in this study is an adapted version of Jacques's "mcxyz.c" [22] by Marti et al. called MCmatlab [23]. We developed two MC simulation scenarios to represent the non-obese and obese skin optical properties. The MC simulations incorporate a 1.5 cm × 1.5 cm × 0.3 cm frame at 500 × 300 × 3000 elements in each direction. Each of our simulations was carried out with 1 billion photons. An example of the layer composition of the geometry is shown in Fig. 4. In this study, the geometric framework of skin for testing the PPG devices is composed of four distinct layers: the epidermis, upper dermis, lower dermis, and subcutaneous fat tissue. The associated optical properties of the multilayer skin geometry are given in Table 2.

Skin tone
Variation in skin tone is modeled by changing the melanin content in the epidermal layer. The Fitzpatrick Scale, a numerical classification for human skin color, is used for this purpose [28]. Fitzpatrick skin types 1 to 6 are assumed to have 3%, 10%, 16%, 23%, 32% and 42% volume fraction of melanin, respectively, in the modelled epidermal layer [24]. A darker skin tone implies a higher melanin content in the epidermis. Melanin shows relatively high absorption in the UV-visible region of light, so a higher melanin concentration is expected to result in increased absorption of photons.
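As a rough illustration of how the melanin volume fractions listed above could translate into epidermal absorption, the sketch below mixes a melanosome absorption spectrum with a melanin-free baseline. The two spectral fits are Jacques's commonly quoted approximations and are an assumption here, not the optical properties of Table 2; the function names are hypothetical, and 523 nm is used only as a representative green LED wavelength.

```python
import math

# Melanin volume fractions assumed for Fitzpatrick types 1-6 (values from the text).
MELANIN_FRACTION = {1: 0.03, 2: 0.10, 3: 0.16, 4: 0.23, 5: 0.32, 6: 0.42}


def mua_melanosome(wavelength_nm):
    """Approximate melanosome absorption in cm^-1 (assumed power-law fit)."""
    return 6.6e11 * wavelength_nm ** -3.33


def mua_baseline(wavelength_nm):
    """Approximate melanin-free skin absorption in cm^-1 (assumed fit)."""
    return 0.244 + 85.3 * math.exp(-(wavelength_nm - 154.0) / 66.2)


def mua_epidermis(skin_type, wavelength_nm=523.0):
    """Volume-fraction-weighted mix of melanosome and baseline absorption."""
    f = MELANIN_FRACTION[skin_type]
    return f * mua_melanosome(wavelength_nm) + (1.0 - f) * mua_baseline(wavelength_nm)


for skin_type in sorted(MELANIN_FRACTION):
    print(f"type {skin_type}: mu_a = {mua_epidermis(skin_type):.1f} cm^-1")
```

Under these assumed fits, the epidermal absorption coefficient grows almost linearly with the melanin fraction, which is the mechanism by which a darker skin tone removes more photons before they reach the dermal vessels.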

Obesity
Higher BMI has a significant impact on the thickness of various layers of the skin [29,30]. For example, Derraik et al. [31] reported that dermal thickness, in non-obese and obese individuals, can vary from 1.0 mm to over 2.5 mm. In addition, changes in physiological parameters due to obesity, such as the loss of water, can change the concentration of absorbers and scatterers in the skin [32,33]. Water in the dermis can be reduced via Trans-Epidermal Water Loss (TEWL), a process in which water evaporates out of the skin through the epidermis. Löffler et al. reported about 66.67% TEWL on the anterior forearm in individuals with higher BMI [34]. Similarly, blood flow can also be affected by obesity. Rodrigues et al. [35] reported 24% less blood flow in the forearm of obese individuals than in non-obese individuals. These changes in anatomical and physiological parameters lead to varying optical properties of the skin and may influence the measurements of the wearable devices. The thickness of each layer of the skin in our geometry for non-obese (BMI = 20) and morbidly obese (BMI = 45) skin types is given in Table 3. The variations in the physiological parameters are applied to the MC geometry by changing the media layer properties. The TEWL percentage and the percentage change in blood due to obesity are multiplied by the baseline (non-obese) water content and blood content in the dermis, respectively. A summary of the physiological parameters of non-obese and morbidly obese skin types utilized in the modeling is given in Table 4. The vessel density in the dermis is 120 vessels per mm² with a ratio of 2:1 vessels in the upper to lower plexus [38,39]. The average resting diameter of a vessel in the upper plexus (terminal arterioles) is 30 µm and the diameter at maximum dilation is taken to be 47 µm. The average diameter of a vessel in the lower plexus is 100 µm [36,40], and its diameter at maximum dilation is 140 µm.
Based on these physiological factors, 80 vessels in the upper plexus and 40 vessels in the lower plexus were considered. Ultimately, simulations conducted with 80 vessels showed equivalent results to ones conducted with an effective layer of blood (thickness of 30 µm), hence subsequent simulations were conducted with the latter geometry. The pulse contribution from the upper plexus is simulated by a volume change in the z-direction for an equivalent blood layer, as proposed by our group in previous work [17]. The equivalent blood layer in the upper plexus of the dermis (representing terminal arterioles) is also porous, to account for the intervascular spaces through which photons may potentially escape without interacting with the vessels.
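The vessel counts and diameters quoted above can be checked with a few lines of arithmetic. The sketch below splits the 120 vessels per mm² by the 2:1 ratio and estimates how much the blood cross-section grows at maximum dilation. It assumes cylindrical vessels of fixed length, so that blood volume scales with the square of the diameter; this is a simplification for illustration and not the equivalent-layer geometry used in the simulations. The function names are hypothetical.

```python
import math

VESSEL_DENSITY_PER_MM2 = 120   # total vessel density, from the text
UPPER_TO_LOWER_RATIO = (2, 1)  # upper : lower plexus, from the text


def split_vessels(total, ratio):
    """Split a vessel count between upper and lower plexus by an integer ratio."""
    upper = total * ratio[0] // sum(ratio)
    return upper, total - upper


def cross_section_um2(diameter_um):
    """Cross-sectional area of a circular vessel in square micrometers."""
    return math.pi * (diameter_um / 2.0) ** 2


def dilation_gain(rest_um, max_um):
    """Ratio of cross-sectional area at maximum dilation to area at rest."""
    return cross_section_um2(max_um) / cross_section_um2(rest_um)


upper, lower = split_vessels(VESSEL_DENSITY_PER_MM2, UPPER_TO_LOWER_RATIO)
print(upper, lower)                       # 80 40
print(round(dilation_gain(30, 47), 2))    # upper plexus: 2.45
print(round(dilation_gain(100, 140), 2))  # lower plexus: 1.96
```

Under this assumption, the upper plexus holds roughly 2.5 times more blood per vessel at maximum dilation than at rest, and the lower plexus roughly twice as much.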

Results and discussion
Typical PPG signals collected with our simulations for the closest source-detector (S-D) pair in the wearables, for non-obese (NOB) skin tone 1 and skin tone 6 (named NOB D1S1 for skin tone 1 and NOB D1S6 for skin tone 6), are shown in Fig. 5. Overall, the absolute absorbance for a full PPG wave increases from S1 to S6 although the normalized absorbance remains the same (Fig. 5(b)). The error bar for the Polar M600 is larger than the others because its relatively larger S-D separation distance reduces the signal-to-noise ratio for the same number of launched photons. The pulsatile features of the PPG waveform are characterized by analyzing its AC to DC signal ratio. Figure 5(c) demonstrates how the AC/DC ratio is calculated, i.e., the range of the signal amplitude divided by the minimum value of the signal. The AC component of the signal represents the pulsatile changes in blood volume and is the peak of the PPG waveform, whereas the DC component is the baseline of the waveform, reflecting the constant light absorption. A similar trend is seen for morbidly obese (MOB) skin with Fitzpatrick skin type 1 and skin type 6 (named MOB D1S1 for skin tone 1 and MOB D1S6 for skin tone 6), although the waveforms exhibit a lower absorption, as shown in Fig. 6.
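The AC/DC definition above (signal range divided by signal minimum) is simple enough to state as code. The sketch below applies it to a synthetic waveform; the waveform shape and its amplitude are hypothetical placeholders chosen only to exercise the function, not simulated PPG data.

```python
import math


def ac_dc_ratio(signal):
    """AC/DC ratio as defined in the text: (max - min) / min."""
    lowest, highest = min(signal), max(signal)
    if lowest <= 0:
        raise ValueError("the baseline (minimum) must be positive")
    return (highest - lowest) / lowest


# Hypothetical waveform: baseline of 1.0 plus a pulsatile part peaking at 0.079.
samples = [1.0 + 0.079 * math.sin(math.pi * i / 100) ** 2 for i in range(101)]

print(f"AC/DC = {100 * ac_dc_ratio(samples):.1f}%")
```

Because the ratio is normalized by the baseline, it is insensitive to an overall scaling of the detected signal, which is why it is used here to compare devices and skin models.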

Broadband simulation for Apple Watch
The above simulations and results assume that the spectrum of each LED is a delta function centered at the dominant wavelength. Unlike lasers, which have a narrow-band illumination spectrum, LEDs have a broad illumination spectrum over which optical properties change dramatically. For example, the illumination spectrum of the green LEDs in the Apple Watch (Fig. 1) has a FWHM of 32 nm.
To account for the broadband effect of the LED at a source-detector (S-D) distance of 3.32 mm (the closest S-D pair configuration in Apple S5), we performed simulations at all LED wavelengths (with the optical properties corresponding to each wavelength), and then weighted the reflectance at each wavelength by the LED probability distribution function [41,42]. The final reflectance (R_final) used in the PPG evaluation is the sum of the product of the weighting factor (w) and the reflectance (R) over all n wavelengths:
$$R_{final} = \sum_{i=1}^{n} w_i R_i \qquad (1)$$
Reflectance values were extracted from both the non-obese skin type 1 and obese skin type 6 models and the corresponding AC/DC ratios were calculated. The AC/DC ratio of the PPG waveform was derived to be 7.9% for non-obese skin type 1 and 5.2% for obese skin type 6. This corresponds to a 34.2 ± 0.67% change in the AC/DC ratio between the two cases. This number is approximately the same as that of the single-wavelength simulation for the same scenario (shown later in section 3.4). Given that the broadband simulation takes 8 days for one skin tone (versus 24 hours for a single-wavelength simulation), all the simulations below assume that the LED illumination spectrum is a delta function.
The AC/DC ratios of the PPG signal from skin with NOB and MOB optical properties at a single wavelength are given in Fig. 7. As expected, the AC/DC ratio decreases from skin type 1 to 6 due to the increase in melanin absorption. An overall reduction in AC/DC ratio in the morbidly obese cases compared to the non-obese cases is noted. The increase in the DC signal along with a reduction in the AC signal is due to the position of the blood vessels in the dermis, which is deeper than in the non-obese cases, as well as other changes in the obese optical properties. Interestingly, the AC/DC ratio of the PPG signal from MOB skin for the Fitbit closest S-D pair is considerably lower than for any other S-D pair.
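The broadband weighting of the reflectance described in this section can be sketched as follows. The Gaussian spectral shape, the 523 nm center wavelength, the 10 nm sampling grid, and the per-wavelength reflectance values are all assumptions made for illustration; only the 32 nm FWHM and the weighted-sum form come from the text. In practice each reflectance would come from a separate Monte Carlo run at that wavelength.

```python
import math


def gaussian_weights(wavelengths_nm, center_nm, fwhm_nm):
    """Normalized weights from a Gaussian LED spectrum (assumed spectral shape)."""
    sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    raw = [math.exp(-0.5 * ((w - center_nm) / sigma) ** 2) for w in wavelengths_nm]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_reflectance(weights, reflectances):
    """R_final = sum_i w_i * R_i over all sampled wavelengths."""
    return sum(w * r for w, r in zip(weights, reflectances))


wavelengths = list(range(490, 561, 10))  # hypothetical sampling grid in nm
# Placeholder reflectances standing in for per-wavelength simulation outputs.
reflectance = [0.020 + 0.0001 * (w - 490) for w in wavelengths]

weights = gaussian_weights(wavelengths, center_nm=523.0, fwhm_nm=32.0)
r_final = weighted_reflectance(weights, reflectance)
print(f"R_final = {r_final:.4f}")
```

Normalizing the weights to sum to one keeps R_final on the same scale as the single-wavelength reflectances, so AC/DC ratios from the two approaches remain directly comparable.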

Skin tone dependent PPG measurements
To distinguish and understand the effect of skin tone on PPG signals, the AC/DC ratio percentage change (k) from skin tone 1 to 6 was calculated and compared for the morbidly obese (MOB) and non-obese (NOB) cases, as shown in Eq. (2). A higher k value indicates a bigger loss in PPG signal at skin tone 6.
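The equation referred to above is not reproduced in this text. A form consistent with the description, stated here as an assumption rather than as the authors' exact expression, is the relative drop in AC/DC ratio from skin tone 1 (S1) to skin tone 6 (S6):

```latex
k = \frac{(AC/DC)_{S1} - (AC/DC)_{S6}}{(AC/DC)_{S1}} \times 100\%
```

As a check on this form, the same relative-change arithmetic applied to the broadband AC/DC ratios of 7.9% and 5.2% reported for the Apple S5 gives (7.9 − 5.2)/7.9 ≈ 34.2%, which matches the change quoted for that case.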
We observed an overall increase in the AC/DC ratio percentage change in all the devices as the S-D separation increases. In the Apple Watch, for the closest S-D pair configuration, the k-value between skin type 1 and skin type 6 is 6.6 ± 0.5% for the NOB case. For the same configuration in the MOB case, the k-value is 2 ± 0.8%. For the farthest S-D pair configuration in Apple S5, the k-value between the skin types is 7.46 ± 10.16% for the NOB case and 5.6 ± 2.1% for the MOB case. In Fitbit Versa 2, the closest S-D pair shows a k-value of 2.4 ± 0.2% for the NOB case and 6 ± 0.27% for the MOB case. Similarly, for the farthest S-D pair (Fitbit D2), the k-value between skin types is 15.3 ± 33.6% for the NOB case and 8.5 ± 4.08% for the MOB case. The higher error in Fitbit D2 can be attributed to the low signal-to-noise ratio at its S-D separation distance, which is longer than in any other device. With one detector at the center, the Polar watch shows a comparably small change in AC/DC ratio between skin tones, with a k-value of 2.7 ± 5.3% for the NOB case and 7.5 ± 1.5% for the MOB case. This may be due to the larger area of its sources and detector. Overall, our simulations show that the effect of skin tone on the PPG signal is small, accounting for no more than a 15% decrease in PPG signal.

Obesity dependent PPG measurements
An evaluation of the effects of obesity on the PPG signal was also carried out. The percentage change in the AC/DC signal ratio of the PPG waveform due to increased obesity (k-value between obese cases) is higher than the effect of skin tone alone, with a maximum of 60% in Fitbit. Furthermore, the percentage change in the AC/DC signal ratio between non-obese and obese cases is higher for the closest S-D pair (i.e., 30.6% vs 20.9% in the Apple Watch for skin tone 1 for the closest and farthest S-D pairs, respectively). This is due to the shallow penetration depth associated with this source-detector separation. The k-value between obese cases is also high for the Fitbit V2 closest S-D pair across the range of non-obese to morbidly obese cases, with a smaller AC/DC signal ratio (Fig. 7(a)). The Polar watch also shows a considerable difference in the PPG signal.

Combined effect of skin tone and obesity on PPG measurements
We then calculated the percentage change in the AC/DC signal ratio of the PPG waveform between the lowest melanin concentration in the epidermis combined with non-obese skin optical properties (NOB-S1) and the highest melanin concentration in the epidermis combined with morbidly obese skin optical properties (MOB-S6). For the Apple S5 closest S-D pair, a 32 ± 0.55% change is observed in the AC/DC signal ratio of the PPG waveform between skin type 1 in the NOB case and skin type 6 in the MOB case. For the same S-D pair configuration, the simulation carried out for a source with the broadband spectrum shows a 34.2 ± 0.67% change in the AC/DC signal ratio between NOB skin type 1 and MOB skin type 6. The percentage change between the extreme cases recorded at the farthest S-D pair in Apple S5 is 25.3 ± 5.6%. The closest source-detector pair in Fitbit Versa 2 shows the largest change, 61.2 ± 0.21%, in the AC/DC signal ratio between skin type 1 in NOB and skin type 6 in MOB, while the change is 25 ± 13.77% for the farthest S-D pair configuration in Fitbit. Similarly, for Polar M600, the percentage change in AC/DC ratio between the extreme cases is 32.9 ± 2.53%. These significant percentage changes in the AC/DC ratio of the PPG waveform from the combined effect of skin tone and obesity can be appreciated by comparing the waveforms from these cases. An example of such a case in Fitbit Versa 2 is shown in Fig. 8.
Fig. 8. Comparison of PPG waveforms between skin type 1 in the non-obese case (NOB-S1) and skin type 6 in the morbidly obese case (MOB-S6) for the closest S-D pair in Fitbit Versa 2.

Combined effect of skin tone and obesity on PPG measurements using an IR light source
Simulations were also carried out for the IR light sources utilized by the Apple S5 and Fitbit Versa 2. Figure 9 shows the AC to DC signal ratio of the PPG waveforms extracted from skin type 1 and skin type 6 using the IR source-detector pair configurations in Apple S5 and Fitbit Versa 2, respectively. These analyses were conducted for both non-obese and morbidly obese skin. The absorption of blood at IR wavelengths is lower than at green wavelengths. This is reflected in the results given in Fig. 9, which show a considerably smaller AC to DC signal ratio of the waveforms. However, these results follow the same trend that was seen for light sources with a wavelength around 523 nm: a reduction in the AC to DC signal ratio is observed as we move from skin with lower melanin concentration to skin with higher melanin concentration. The calculated effect of higher melanin concentration and higher level of obesity for the IR emitter shows that all the S-D pairs we studied have a higher percentage change in the AC/DC signal ratio than with a green light emitter. For the Apple S5 IR sources, the percentage changes in AC/DC ratio between the NOB-S1 and MOB-S6 cases are 59.6 ± 0.78%, 59.4 ± 0.49% and 56.6 ± 0.54% for the closest, symmetric, and farthest S-D pair configurations, respectively. Likewise, for Fitbit Versa 2 this percentage change is 62 ± 0.3% for the closest S-D pair and 56.6 ± 1.67% for the farthest S-D pair. The lower absorbance of blood in the IR range can lead to poor monitoring of dermal blood volume changes. This could be associated with the observed increase in percentage change in the AC/DC signal ratio between the extreme cases when using an IR emitter rather than a green light source.

Conclusion
We computationally analyzed the performance of three commercially available wrist-worn devices by using a Monte Carlo model of light transport to extract the PPG waveform. We considered different skin tones and different levels of obesity. The prime source of the PPG signal in wrist-worn devices is the superficial vascular network [43]. We have presented a new model that represents the microvasculature in that region and used it to study the influence of skin tone and obesity-associated variations in physiological parameters on the PPG waveforms. The results showed that, if we only consider the change of skin tone and neglect the changes in obesity, the PPG signal (AC/DC) displays less than 10% change over all devices. When obesity is considered in the simulation separately or together with skin tone, a maximum of 61.2% change in the PPG AC/DC signal was observed. This suggests that a combined effect of obesity and higher melanin concentration could contribute to the errors previously reported [7]. The loss of signal observed in our simulations appears to be strongly related to the increase in dermal thickness occurring with elevated BMI. This could indicate that the source-detector separation currently used in these devices may not be sufficient to sample the vascular plexus at the same rate for obese and non-obese individuals. Based on these observations, we can postulate that a longer S-D separation in the PPG sensor, which enables an increased interrogation depth, would better capture blood volume changes in obese skin.
Our study was limited to observations of photons detected by the instruments' detectors, as the simulated signal amplitude neglected any information related to device-specific signal postprocessing, which is proprietary. Overall, the drastic decrease in signal shown for the extreme case of type 6 skin tone and BMI larger than 40 appears problematic but could be mitigated in postprocessing through algorithms or machine learning approaches. It is to be noted that none of our simulations reduced the signal to absolute zero and a peak could still be observed in all cases, which could theoretically be used for measurement of heart rate. Nevertheless, the detectors' noise levels were unknown and considered zero in our simulations, yielding an ideal SNR assessment.
Similarly, we did not include any evaluation of the sources' power, as the wearable detectors' specifications were not available, making a complete analysis of the instrument error unfeasible. Our analysis is based on limited published data on obese skin optical properties as well as on our extrapolation of the impact of physiological changes on the skin. Future studies will focus on the experimental characterization of optical properties associated with BMI increase.
Finally, the shape of the PPG signal has recently been used to derive blood pressure information through artificial intelligence and other approaches [44]. Non-invasive and continuous measurement of blood pressure is desirable for the management of cardiovascular disease, and research in this context has been explored by many companies developing wearables, including the ones reviewed here. Our study has shown that the shape of the PPG signal is strongly impacted by obesity and skin tone. For example, the dicrotic notch is minimized for MOB and skin type 6, which would make the devices unsuitable for extrapolation of blood pressure through PPG.