Generalized channel separation algorithms for accurate camera-based multi-wavelength PTT and BP estimation

Single-site multi-wavelength (MW) pulse transit time (PTT) measurement was recently proposed using contact sensors with sequential illumination. It leverages the different penetration depths of light to measure the traversal of a cardiac pulse between skin layers. This enabled continuous single-site MW blood pressure (BP) monitoring, but faces challenges such as subtle skin compression, which substantially influences the PPG morphology and the subsequent PTT. We extended this idea to contact-free camera-based sensing and identified the major challenge of color channel overlap, which causes the signals obtained from a consumer RGB camera to be a mixture of responses in different wavelengths, thus not allowing for meaningful PTT measurement. To address this, we propose novel camera-independent data-driven channel separation algorithms based on constrained genetic algorithms. We systematically validated the algorithms on camera recordings of palms and corresponding ground-truth BP measurements of 13 subjects in two different scenarios, rest and activity. We compared the proposed algorithms against established blind source separation methods and against a previous camera-specific physics-based method, showing good performance in both PTT reconstruction and BP estimation using a Random Forest regressor. The best-performing algorithm achieved mean absolute errors (MAEs) of 3.48 and 2.61 mmHg for systolic and diastolic BP, respectively, in a leave-one-subject-out experiment with personalization, solidifying the proposed algorithms as enablers of novel contact-free MW PTT and BP estimation.


Introduction
Blood pressure (BP) estimation is highly valuable, as BP is the vital sign most commonly measured by clinicians in both primary and secondary healthcare [1]. It should be monitored regularly, especially for people at higher risk of cardiovascular disease (e.g., the elderly population) [2]. Sphygmomanometry remains the most commonly used BP measurement technique [1]. It depends on a wearable inflatable cuff and a specific measurement protocol, as the cuff must be placed at heart height of a stationary subject, neither too loosely nor too tightly. The measurement is cumbersome, as it involves cables and the near-mandatory presence of a second person for cuff placement, since self-placement is rarely suitable. Such pre-measurement difficulties can influence the BP of the subject, causing an incorrect measurement due to incorrect placement, frustration or white-coat syndrome [3]. Alternative methods have been proposed that do not require a sphygmomanometer and instead use other physiological signals to derive information about subject hemodynamics, allowing for continuous and long-term BP monitoring [4]. The main enabler at the forefront of cuffless BP estimation is photoplethysmography (PPG), which reflects cardiovascular activity and blood perfusion via color changes of the skin [5]. Such subtle changes can be captured using a simple setup comprising a photodiode and a light source. Attempts at cuffless BP estimation include either a) detailed analysis of the PPG waveform morphology [4] or b) measurement of the pulse transit time (PTT) between multiple sites (e.g., palm and forehead) using either an ECG and a PPG sensor or two PPG sensors, as PTT is strongly correlated with BP [6]. Recently, researchers also resolved the requirement of multi-site measurement by leveraging multi-wavelength (MW) PPG approaches with contact sensors [7]. The idea is to use PPG reconstructions from different skin depths obtained using light sources with different wavelengths. Such waveforms are expected to be slightly delayed due to the pulse wave traversal from deeper skin layers with larger arterial vessels such as arterioles towards the upper layers containing capillary loops, which allows for computation of PTT [8].
Despite resolving the need for a sphygmomanometer, cuffless approaches still require wearable sensors and thus retain many challenges, such as cable or battery dependence, inability to use them on skin with conditions like burns or on the sensitive skin of neonates, and the general reluctance of users to continuously carry wearables. Contact-free monitoring of people's medical condition has thus seen a great increase in popularity, supported by state-of-the-art results for many physiological parameters, such as heart and respiratory rate (HR and RR) [9], oxygen saturation (SpO2) [10], glucose levels [11] and also BP [12]. Such contact-free physiological monitoring is most often based on remote photoplethysmography (rPPG), which uses the same underlying principle as traditional PPG, namely changes in skin color due to blood volume changes, but obtains the data remotely using a camera. State-of-the-art research succeeds in remotely estimating some highly-expressed physiological parameters, such as HR, in a robust and accurate manner, achieving errors as low as 1-3 beats-per-minute (BPM) [13]. Other, more subtle parameters, such as the previously discussed BP, remain more difficult to estimate. This is due to both physiological factors (higher complexity of hemodynamics, which depend on vascular stiffness, blood composition, fat deposits, etc.) and technical limitations (BP estimation from PPG commonly relies on precise morphology, requiring diastolic peaks and notches to be precisely reconstructed [4]).
In this work we propose and upgrade a novel contact-free BP estimation system, which measures MW PTT between different skin layers using a modified RGB camera. It reconstructs rPPG waveforms from different layers of skin at the palm, using different parts of the light spectrum with different penetration depths. The palm was chosen as the measurement site due to its good ratio between skin thickness and perfusion, being one of the areas with the thickest skin, even without the epidermis [14,15]. Additionally, this was more comfortable for subjects, as we used strong dedicated light sources that emitted heat, which is uncomfortable if directed at the face. Finally, palm skin (together with the soles of the feet) consistently has the lowest melanin content regardless of skin type [16], which makes it a more robust region for subjects with different skin tones. The obtained MW PTTs were initially not informative due to high channel overlap, a consequence of the imperfect design of camera image sensors. We thus propose camera-independent data-driven channel separation algorithms that allow for measurement of informative PTTs, which can in turn be used for accurate BP estimation.

Related work
In the previous section we discussed the underlying mechanisms enabling the approaches used in fundamental related work from this field, focusing especially on PTT measurement and subsequent BP estimation. Here we focus on more recent examples of such methods in the literature, especially on MW PPG reconstruction. We also provide a synthesis, grouping the methods and highlighting their unresolved challenges.
A multi-site contact MW approach was proposed by Karolcik et al. [17], in which wrist and finger sensors were used to acquire 4 PPG channels from different wavelengths. Using these waveforms they estimated HR, SpO2 and pulse wave velocity (PWV) with mean errors of 4.08 BPM, 1.54% SpO2 and 5.80 m/s respectively, matching values found in the literature.
Li et al. [18] proposed a dynamic spectroscopy method based on MW PPG proportion extraction, which realizes the extraction of dynamic spectroscopy by calculating the proportion coefficient between PPG waveforms. They obtained the signals using a contact finger sensor and reported decreases in error of up to 31.56%.
Huang et al. [19] investigated the best wavelengths for camera-based MW PTT computation, as proposed earlier by Slapničar et al. [20]. They confirmed, in an experiment involving 17 participants, that using PTT between the green and infrared bands is better for BP estimation than using the regular red band, as correlations between PTT and BP are higher and errors lower, achieving mean errors of 5.78 and 6.67 mmHg for SBP and DBP respectively.
It was already shown in earlier work by Liu et al. [7] that measuring PTT between different skin layers using an MW approach is feasible for BP estimation; however, since the approach was contact-based, the influence of skin compression was not considered [21,22]. Similarly, contact-free multi-site PTT measurement was discussed at length in a review paper by Chan et al. [23]. They found 13 papers between 2010 and 2019 that proposed multi-site PTT computation for BP estimation, relying on both contact and remote sensors. In both cases the idea is the same, as the delay of a single pulse was measured at different sites and correlated to BP. They also highlighted the potential integration of sensors in smart environments (e.g., cameras), which could lead to unobtrusive remote BP estimation.
The research discussed in this and the previous section offers great promise towards seamless unobtrusive BP monitoring; however, some concerns remain for each approach identified in the literature:
1. Multi-site contact PTT approaches [24] using wearable sensors require two precisely synchronized sensors with good skin contact to obtain high-quality waveforms for reference point detection and subsequent PTT computation. This requires two devices at two skin locations and cannot be implemented on a single compact device. Furthermore, wearable sensors are battery dependent and cannot be used by people with specific skin conditions.
2. Camera-based multi-site contact-free PTT approaches [25] can be implemented with a single sensor (camera), but require a good consistent exposure of two pre-determined monitored regions of interest (ROIs, commonly forehead and palm), which imposes requirements and restrictions on a subject's positioning, making it impractical. Furthermore, when these ROIs are not precisely fixed in a camera frame, tracking and segmentation are required, which introduces additional algorithmic and computational requirements.
3. Single-site contact MW PTT approaches [7] require specialized hardware, including an image sensor sensitive to the required wavelengths and a light source capable of producing said wavelengths (typically narrow-band LEDs). While this is an improvement over using two sensors, as it omits the requirement for precise synchronization and additional power, it is not readily available in existing devices and still suffers from the skin contact requirement. Importantly, wearable sensors also inherently compress the skin slightly in an effort to maintain good skin-sensor contact. Such compression can substantially distort the PPG waveform [21], making it less reliable [22], especially in the upper skin layers corresponding to shorter wavelengths.
4. Single-site PPG morphology approaches [4] are used with both contact and contact-free optical sensors and rely on precise morphological analysis of the (r)PPG waveform on a per-cycle basis. This means that there is reliance on consistently high-quality waveforms and not only a single prominent reference point (e.g., systolic peak), as many features require consistent detection of the diastolic peak and even the dicrotic notch, which are often very difficult or impossible to obtain outside of a highly-controlled lab environment with high-quality contact sensors [26]. This makes many widely-investigated features such as augmentation index, stiffness index, systolic and diastolic times, amplitude ratios etc. [27] less useful when the waveform is not ideal. Furthermore, there is some debate in the community on which morphological features perform well universally, as the underlying connection with BP is not as clear as with PTT. To circumvent the required explicit definition and computation of features, many people rely on black-box models (neural networks) to internally compute features in end-to-end approaches; however, the connection between such neural-network-derived features and BP is even less well understood. The approaches relying on PPG morphology can be used in either a contact or contact-free manner, as only a single (r)PPG signal is needed. However, due to the aforementioned factors, the performance and feasibility diminish quickly when moving from high-quality contact sensors to (consumer) RGB cameras, as the waveform details become less apparent.

Problem, Contributions and Significance
Our aim was to develop and validate a contact-free MW optical BP estimation system that would offer the individual advantages of the previously described approaches, without their identified disadvantages. We thus proposed [28] to embed the principle of the contact MW approach into a contact-free camera modality, to measure single-site PTT without the requirement for custom wearable hardware and without a wearable sensor compressing the skin at all [21].
However, despite the advantages of such a hybrid method, a major challenge arises when attempting to use the MW approach in a contact-free scenario with a consumer RGB camera: channel overlap due to the design of the Bayer filter on the image sensor. Image sensors are characterized by their quantum efficiency, which describes how many photons of a given wavelength are registered by the pixels sensitive to that wavelength. The Bayer filter on top of the sensor in consumer RGB cameras is primarily designed to simulate the human eye rather than to capture physiological information [29]. This means that the image sensor exhibits an overlapping response in spectral bands of visible light, causing for instance pixels sensitive to green wavelengths (e.g., around 550 nm) to also respond partially to light in the neighbouring blue wavelengths (e.g., around 475 nm), as seen in Fig. 1. Contact MW approaches use specialized sequential illumination from different narrow-band LEDs, which allows them to independently reconstruct PPG waveforms from corresponding pixels at each activation of a specific-wavelength LED [8]. While this avoids the inherent channel overlap problem of the image sensor, it in turn requires precise and consistent synchronization using time-multiplexing (at sample level) between the image sensor and the light source: at the moment of each frame capture, the right LED must emit light. Furthermore, this also requires a very high sampling frequency, as only one sample of each wavelength is obtained every n cycles, where n is the number of wavelength bands of the light source (usually 3-4, e.g., blue, green, yellow and red [7]).
Using an RGB camera and a constant light source with a spectrum comprising all bands of interest lowers the sampling frequency requirement by a factor of n, as it captures information from all bands in each frame simultaneously. It does, however, require special attention to the separation of color traces obtained at each time step, due to the previously described channel overlap and the sensor response in the NIR part of the spectrum. This response in NIR is not a problem in everyday use, since off-the-shelf cameras are always equipped with an on-sensor IR-block filter. However, since the NIR part of the spectrum was shown to be important and more robust for rPPG reconstruction [22], and is also furthest away (among the bands feasibly collectable with an RGB camera) from the commonly used green wavelengths, we customized our camera by removing the IR filter to also capture the NIR information. Another advantage of the camera is a much higher spatial resolution compared to wearable sensors, which offers more information from more pixels and allows for better potential selection and evaluation of different ROIs within a frame.
The importance of the problem described above is further amplified when measuring single-site PTT between different skin layers using a contact-free optical MW approach. Such PTT is inherently very short due to the very short distance the blood traverses (skin thickness of a few millimeters) and a relatively high pulse wave velocity [30], so the delays between the reconstructed waveforms are extremely small and very difficult to measure. This means that any distortion of the waveform, such as that caused by wearable sensor pressure, can substantially influence the measured PTT, making it less correlated with physiological phenomena and more with skin compression. Using a contact-free optical sensor resolves this problem [21,22]. However, when using consumer RGB cameras for remote sensing, the channel separation step is often overlooked, despite being of vital importance for consistent detection of single-site MW PTTs between different skin layers.
In this paper we validate on a larger dataset that explicitly separating channels as a first signal processing step is mandatory when measuring short PTTs between different skin layers. We propose novel algorithms for such channel separation based on constrained genetic algorithms, which generalize our previous camera-specific algorithm [28] by fusing expert knowledge of image sensor properties with data-driven algorithms. Initially we used camera-specific channel separation coefficients to design the channel separation (demixing). This approach is very accurate and derived directly from the physical properties of the specific image sensor, but cannot be used generally without detailed knowledge of a specific image sensor. We thus propose two novel camera-independent data-driven channel separation algorithms, which enable general contact-free optical MW PTT and BP estimation, and validate them against the previous camera-specific physics-based approach and a baseline of using no channel separation. The proposed and evaluated approaches are:
1. Physics-based approach [28]: The channel separation algorithm that uses the quantum efficiency and spectral distribution of a camera-specific image sensor to determine the demixing coefficients to separate the channels.
2. Genetic algorithm with a BP regression model (GA-BP): A general data-driven metaheuristic algorithm that optimizes channel separation parameters so that the error of a BP regression model is minimal.
3. Generalized genetic algorithm using phase delay (GA-PD): A further generalized version of the previous algorithm that does not require an accompanying training of a BP regression model, as it instead maximizes the phase delay or rather minimizes the cross-correlation between channels, meaning that it finds the phase delay at which cross-correlation is the lowest.
We validate them using a systematic robust pipeline for contact-free optical MW BP estimation that serves as a framework to evaluate the performance and influence of the proposed channel separation algorithms on the quality of a BP prediction model.
The main contributions are thus as follows: 1. Proposed general camera-independent data-driven algorithms for optimization of color channel separation that, to the best of our knowledge, for the first time enable contact-free optical MW PTT and BP estimation with a consumer RGB camera, 2. A complete systematic and robust video processing experimental framework for contact-free optical MW PTT and subsequent BP estimation, and 3. Detailed analysis and validation of the proposed algorithms on an original specialized dataset that offers relevant insights for such BP estimation and a benchmark for future studies.
The significance of the proposed algorithms is in enabling contactless single-site BP estimation, which can lead to devices that could be used by clinicians for general screenings or continuous BP monitoring of subjects with specific conditions, such as neonates and subjects with burns. This approach omits the need for a professional to fit a cuff or insert an intravenous catheter and enables more comfortable monitoring at a distance, without the need for several sites or for exposing the face, as is common in existing contact-free multi-site PTT approaches. Furthermore, it does not require two (synchronized) sensors or a high-resolution camera, and requires only a single small skin exposure site. This approach instead relies on the established medical mechanisms of PWV and PTT, and can be considered more robust and reliable compared to using black-box morphology models or inconsistent wave features that can be difficult to detect (e.g., relying on the diastolic peak or notch). Finally, compared to existing contact MW approaches, no external skin compression influencing the measurement is present.
In summary, the proposed algorithms enabling contact-free optical MW BP estimation resolve individual issues of other existing contact-free approaches (multi-site measurement, reliance on precise morphology, etc.) and show that existing MW contact approaches can be extended to contact-free scenarios under controlled conditions.
The rest of this paper is organized as follows: in Section 4 we give a brief overview of our recording setup and collected data. The following Section 5 details our proposed algorithms and experimental setup. In Section 6 we present the results, which we then discuss and conclude in Section 7.

Data collection
Human skin exhibits a layered structure comprising three major layers (epidermis, dermis and hypodermis), as shown in Fig. 2. The epidermis serves as a protection layer and is thus rough and poorly perfused with little vascular presence. The upper reticular dermis exhibits better perfusion as it contains the capillary loops, while the deeper part of the dermis and the hypodermis contain arterioles and tiny arteries on top of subcutaneous fat tissue.
Since good perfusion across layers is important for rPPG reconstruction, we decided to use the palm skin as our measuring site, as the skin there is not too thin (up to 2 mm) while being well-perfused [31], allowing us to obtain useful information from different depths, including deeper layers. Due to the aforementioned challenge of PTT being short, we required an exceptionally high sampling frequency of our RGB camera. In line with related work [7] and reported pulse wave velocities (PWVs) in capillaries of 6.4-17.6 mm/s [30], we placed a conservative requirement at 250 frames-per-second (FPS) and selected the iDS U3-3040SE-Q camera, which allows for such high-frequency recording. As NIR is further away from green (not a neighbouring spectral band), offers deeper skin penetration, and was reported to be more robust for contact-free rPPG [22], we wanted to collect this data. NIR is obtainable using a traditional RGB camera by removing the factory-placed IR-block filter from the image sensor. In an effort to initially minimize band overlap, we also used a triple band-pass filter, which allows only light in narrow bands of 475±10 nm (blue), 550±10 nm (green) and 850±22 nm (NIR) to pass, as seen in Fig. 1. Furthermore, human skin often exhibits sweating, especially when exposed to heat or after physical activity, which can result in specular reflections on the surface. We counteracted this by using the MidOpt PR1000 VIS/SWIR Wire Grid Linear Polarizer, which is effective in the range of 400-2000 nm. As the light source we used two perpendicular (to minimize shadows) filament halogen bulbs that emit light in the full spectrum of interest, including NIR, which is not present in consumer LED bulbs. We used a DC power source to eliminate potential bulb flickering, which is a major source of noise in recordings at such high FPS. The room had no light source other than the bulbs, to eliminate interference.
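To give a feeling for why the frame rate matters, the following back-of-the-envelope sketch relates a pulse wave velocity to the number of frames spanned by the transit between two depths. Only the PWV range and the 250 FPS rate come from the text; the 1 mm path length and the 30 FPS comparison value are assumptions chosen purely for illustration.

```python
def transit_frames(path_mm: float, pwv_mm_s: float, fps: float) -> float:
    """Number of camera frames spanned by a pulse travelling `path_mm`
    at a velocity of `pwv_mm_s`, when recording at `fps` frames per second."""
    delay_s = path_mm / pwv_mm_s  # transit time in seconds
    return delay_s * fps


if __name__ == "__main__":
    PATH_MM = 1.0  # hypothetical vertical distance between two skin depths
    for pwv in (6.4, 17.6):  # capillary PWV range quoted in the text (mm/s)
        for fps in (30.0, 250.0):  # 30 FPS is an assumed consumer frame rate
            frames = transit_frames(PATH_MM, pwv, fps)
            print(f"PWV {pwv:5.1f} mm/s at {fps:5.0f} FPS: {frames:6.2f} frames")
```

Under these assumptions, the fastest quoted velocity corresponds to a delay of roughly 14 frames at 250 FPS but fewer than 2 frames at 30 FPS, which is consistent with the conservative frame-rate requirement described above.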
We collected data from 13 volunteers, 10 male and 3 female, with a mean age of 30 ± 3.2 years. Most were healthy young adults, with two exceptions being older and long-time smokers. All gave explicit consent to participation and their data was anonymized. We prepared two recording scenarios with the aim of inducing substantial BP changes. The first was resting in a seated position, where the subjects were taking deep breaths and relaxing. After one minute, a 30-second recording was made and their ground-truth BP was measured with a clinical-grade Omron cuff-based sphygmomanometer. The second scenario included intense physical activity, consisting of 1 minute of jumping jacks followed immediately by jump squats until failure. At failure, the subject was immediately measured, as they exhibited substantially elevated BP and HR. We repeated this experimental protocol at least twice for each subject, preferably on different days (for some subjects this was impossible due to availability, but ten out of thirteen were measured on subsequent days with 24h between measurements and then a third time a week later), to obtain more varied information, avoid daily specifics, and further validate robustness. This gave us at least two cases of two distinct hemodynamic states for each subject, meaning that four measurements per subject in total was the minimum. Some subjects participated more (again depending on availability), so we obtained more recordings for the majority of the participants, as they participated in one additional session, 1 week after the last session. The average number of measurements per subject was 5.54, as ten out of thirteen subjects had exactly six measurements, and three of them had four measurements, since they only participated in two rounds of recording scenarios. Average changes in BP between different scenarios for each subject are reported in Table 1. Each 30-second recording was assigned the ground-truth BP and HR values of the Omron device. This is reasonable since the measurement itself takes around 20 seconds, and since BP and HR do not change that rapidly outside of extreme circumstances. Some data was discarded, as the Omron device sometimes returned an error, meaning we did not have the ground-truth BP.
There is a possibility of physiological changes happening to individuals between different days, which in turn influence BP. This is one of the reasons why we chose to record data at different times, so as not to overfit to daily specifics (e.g., a stressful situation at work prior to recording). However, using data that is temporally far apart for training and testing can lead to unrealistic results or indicate a requirement for extensive calibration (in different states). We thus investigated whether there is a correlation between the time between measurements and the measured BP values, but did not observe any correlation in our recordings (R=0.02). We additionally checked for correlation between the time between measurements and the computed errors, and also did not observe significant correlation (R=0.03).

Evaluation framework and data (pre)processing
The architecture of our evaluation framework for the contact-free MW BP estimation system is shown in Fig. 3 and comprises several parts.

Fig. 3. Architecture of our evaluation framework for contact-free MW BP estimation that we used for evaluation of the channel separation algorithms. The green box highlights the important channel separation step that enables contact-free MW PTT measurement using the proposed algorithms.

Standard preprocessing
Raw data was first manually debayered to obtain initial unprocessed images using custom code alongside the camera SDK. Afterwards, three traces were obtained from the corresponding color channels (RGB) by spatially averaging the relevant pixels of the whole image (all of it was skin). The obtained traces were first processed with a Hampel filter to remove outliers and then filtered with a zero-phase Butterworth band-pass filter (zero-phase to ensure no changes in the location of relevant reference points). This removed sensor noise, baseline drift and other irrelevant information outside of the selected conservative frequency band of [0.5, 3.0] Hz. The amplitude of the PPG from each channel was then normalized to the range [-1, 1].
We obtained three final outputs after the preprocessing, one per color channel. These were always very similar in terms of amplitude and phase (exhibiting no shift), thus not allowing for PTT measurement. They were subsequently fed into different channel separation algorithms to demix the channels and obtain individual color traces with consistently measurable and informative PTTs.
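The preprocessing chain described above can be sketched as follows. This is a minimal illustration, not the authors' code: the sampling rate `fs`, the filter order, and the Hampel window and threshold are assumptions, since the text only fixes the pass band and the output range.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def hampel(x, half_window=5, n_sigmas=3.0):
    """Replace outliers with the local median (simple Hampel filter).

    half_window and n_sigmas are illustrative defaults, not values from the paper.
    """
    x = np.asarray(x, dtype=float)
    y = x.copy()
    k = 1.4826  # scales the MAD to a standard deviation for Gaussian data
    for i in range(len(x)):
        lo, hi = max(0, i - half_window), min(len(x), i + half_window + 1)
        window = x[lo:hi]
        med = np.median(window)
        mad = k * np.median(np.abs(window - med))
        if np.abs(x[i] - med) > n_sigmas * mad:
            y[i] = med
    return y


def preprocess_trace(x, fs, band=(0.5, 3.0), order=3):
    """Hampel filter, zero-phase Butterworth band-pass, then scaling to [-1, 1]."""
    x = hampel(x)
    b, a = butter(order, list(band), btype="bandpass", fs=fs)
    x = filtfilt(b, a, x)  # forward-backward filtering introduces no phase shift
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0
```

Forward-backward filtering with `filtfilt` is one common way to obtain a zero-phase response, which matters here because any phase distortion would move the reference points used for PTT.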

Channel separation model, constraints and methods
In general we can assume, as seen in Fig. 1, that each color trace obtained from an RGB image sensor is a combination of the actual response in the relevant wavelength and an undesired response in the other wavelengths, especially if the IR-block filter is removed from the camera. Thus the RGB traces obtained from a camera can be written as given in Eq. (1), where r, g, b are the obtained channel-overlapped traces, R, G, B are the actual channel-separated responses, and a_n, b_n and c_n are the coefficients representing the ratios of each response present in the overlap.
Based on the underlying physiology of the skin tissue and pulse wave propagation properties [7], we can make some assumptions that simplify the system. The blue trace was discarded since it is exceptionally noisy and makes little physiological sense due to the lack of perfusion in the epidermis [33]. Furthermore, we always set the coefficient corresponding to the color we are trying to separate (e.g., a_1 for red and b_2 for green in Eq. (1)) to 1. Finally, the remaining coefficients in Eq. (1) were limited to the range [-1, 0], as we are always subtracting undesired response from the mixture, never adding it. Considering these constraints, our model simplifies to Eq. (2).
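To make the constraints concrete, the sketch below applies one candidate weight vector to the three preprocessed traces. It assumes one plausible reading of the model, namely that a separated trace is the target camera trace plus non-positive multiples of the other two; the exact layout of Eq. (2) is not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def check_constraints(weights, target_index):
    """Target coefficient must be 1; the remaining ones must lie in [-1, 0]."""
    w = np.asarray(weights, dtype=float)
    if w[target_index] != 1.0:
        raise ValueError("coefficient of the target channel must be 1")
    others = np.delete(w, target_index)
    if np.any(others < -1.0) or np.any(others > 0.0):
        raise ValueError("remaining coefficients must lie in [-1, 0]")
    return w


def separate(traces, weights, target_index):
    """Weighted sum of the mixed traces; traces has shape (3, n_samples).

    Assumed form: separated = 1 * target + (non-positive weights) * others.
    """
    w = check_constraints(weights, target_index)
    return w @ np.asarray(traces, dtype=float)
```

For example, `separate(traces, [1.0, -0.5, 0.0], 0)` would subtract half of the second trace from the first and leave the third unused.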

Blind source separation
Commonly used methods for blind source separation of linear mixtures as given in Eq. (1) are Principal Component Analysis (PCA) and Independent Component Analysis (ICA) [34]. These are data-driven black-box methods and do not use any underlying understanding or models for the separation of channels. PCA requires the amplitude variation of the components to be sufficiently different to determine the eigenvector directions of the demixing matrix, while ICA assumes that the sources are statistically independent and non-Gaussian [34]. As discussed previously, the waveforms obtained after preprocessing remain mixtures of overlapping channels, as these are not very different from one another in terms of frequency. PCA and ICA can then be used to separate the overlapping source channels present in the mixture [35]. However, we hypothesize that the above assumptions of PCA and ICA are problematic with respect to obtaining meaningful phase-delayed channel-separated traces, as such traces cannot be said to be sufficiently different or independent. Based on skin physiology and hemodynamics shown in Fig. 2 and described in related work [36], we know that blood perfusion is interconnected and continuous throughout the cardiovascular system, including the skin layers, which thus influence one another.
We investigated both PCA and ICA to confirm or reject this hypothesis, and as benchmark methods that can intuitively be considered candidates for resolving the channel overlap in contact-free MW monitoring. The results were compared against other channel separation methods and are reported in Section 6. Given the model from Eq. (1), we took the preprocessed R, G and B traces obtained from the camera as input and always assumed three output source components corresponding to demixed pulsatile color signals.
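The intuition behind the hypothesis can be illustrated with a small PCA sketch based on a plain SVD. The synthetic traces and their phase offsets are assumptions chosen only for illustration; ICA is not shown, as it would typically rely on an external library.

```python
import numpy as np


def pca_components(traces):
    """PCA of traces with shape (n_channels, n_samples) via SVD.

    Returns the component time series (sorted by variance) and their variances.
    """
    x = np.asarray(traces, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)      # center each channel
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    components = u.T @ x                       # project onto principal directions
    explained = s ** 2 / (x.shape[1] - 1)      # variance of each component
    return components, explained


# Three nearly identical pulsatile traces with tiny phase offsets (synthetic).
t = np.arange(0, 30, 0.01)
demo = np.vstack([np.sin(2 * np.pi * 1.2 * t - p) for p in (0.0, 0.05, 0.1)])
demo_components, demo_variance = pca_components(demo)
```

With inputs this similar, almost all variance ends up in the first component, so the remaining components carry very little of the small phase difference that PTT measurement depends on.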

Physics-based approach
Given full information about the precise quantum efficiency of the image sensor, the filters used with the image sensor, and the spectrum of the light source, one can derive the coefficients a_n, b_n and c_n for channel separation directly from the camera and light source physics. Looking at Fig. 1 and taking into account the filter bands, we can determine the relevant coefficients by computing the ratios of areas under the curves (AUC) of the three channels [28]. Such a method is fully dependent on precise quantum efficiency and spectrum information, which are unique to each image sensor and light source. Our aim was to generalize this method and make it fully data-driven, in the sense that we propose an algorithm that can be used without knowing the image sensor and light source specifics in advance [28].
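As a rough illustration of the AUC-ratio idea, the sketch below integrates spectral response curves over a wavelength band and forms their ratio. The curves, the band limits and the function names are hypothetical, and how such ratios map onto the final coefficients follows [28] and is not reproduced here.

```python
import numpy as np


def band_auc(wavelengths, response, band):
    """Trapezoidal area under a response curve inside [band[0], band[1]] (nm)."""
    lo, hi = band
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    w, r = wavelengths[mask], response[mask]
    return float(np.sum(0.5 * (r[1:] + r[:-1]) * np.diff(w)))


def overlap_ratio(wavelengths, resp_other, resp_target, band):
    """Ratio of an undesired channel's area to the target channel's area in a band."""
    return band_auc(wavelengths, resp_other, band) / band_auc(wavelengths, resp_target, band)
```

In practice the response curves would be the product of measured quantum efficiency, filter transmission and light source spectrum, which is exactly the information a data-driven method tries to avoid needing.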

Genetic algorithm using BP (GA-BP)
Our goal was to measure the PTT in order to train a regressor for BP estimation. It only makes sense to use such a regression model as a fitness function if the importance of the PTT to the model is high, meaning it can serve as a meaningful physiological feature for BP estimation. We thus initially checked the average feature importances of the trained Random Forest regression models using the physics-based channel separation. We ran a leave-one-subject-out (LOSO) experiment with personalization, meaning that one instance of rest and one instance of activity of the left-out subject were added to the training data. We checked the average importance of the PTT for SBP and DBP estimation in terms of mean decrease in impurity (MDI), and found that the PTT is extremely dominant compared to other features, as shown in Fig. 4. This shows that the model relies heavily on the PTT, so it makes sense to use the model as a fitness function to evaluate the quality of the PTT, or rather of the channel separation algorithm that facilitates its computation. The PTT for one recording was computed as the average PTT between NIR and green reference points over the whole 30-second recording, paired with a single reference ground-truth BP measurement. Additionally, some commonly used morphological features (T_c = cycle length, T_s = systolic rise time, T_d = diastolic rise time, AUC_c = area under the curve of the whole cycle, AUC_s = area under the curve of systolic rise, AUC_d = area under the curve of diastolic slope) and demographic features (age and sex) were added to each instance for the training of the Random Forest regressor.
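The LOSO split with personalization can be sketched with plain index bookkeeping. Which rest and activity instance of the left-out subject is moved to the training set is not specified in the text; taking the first of each is an assumption made here, and the function name is hypothetical.

```python
def loso_personalized_splits(subjects, scenarios):
    """Yield (held_out, train_idx, test_idx) for LOSO with personalization.

    subjects[i] and scenarios[i] describe recording i; scenarios holds
    "rest" or "activity". One rest and one activity recording of the
    held-out subject (the first of each, by assumption) join the training set.
    """
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        own = [i for i, s in enumerate(subjects) if s == held_out]
        calibration = []
        for scenario in ("rest", "activity"):
            matches = [i for i in own if scenarios[i] == scenario]
            if matches:
                calibration.append(matches[0])
        test = [i for i in own if i not in calibration]
        yield held_out, train + calibration, test
```

A regressor would then be fitted on `train_idx` and evaluated on `test_idx` in each iteration, and the errors averaged across held-out subjects.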
After PTT importance was confirmed, we framed the problem described in Eq. (2) as an optimization problem, where one determines the optimal value of coefficients with respect to some fitness function. When choosing the fitness function, we initially decided to use the average mean absolute error (MAE) of accompanying trained Random Forest regressors predicting SBP and DBP. The regressor was evaluated each time in a 5-fold cross-validation (CV) experiment without shuffling (instances of the same subject stayed together to avoid overfitting within the CV), using the PTTs computed with a given candidate vector.
Once the MAE of a regression model estimating BP was chosen as the fitness function, we attempted to find solutions to our optimization problem by using a genetic algorithm that trained a regression model to compute the fitness of each candidate vector. For our case, we defined an initial population of n = 100 vectors (with the previously described constraints) representing the channel separation coefficients we were optimizing. We optimized them through g = 200 generations. We used arithmetic crossover, random mutation (randomly adding or subtracting a small random value in the range [0.01, 0.1] to a small subset of subjects) and tournament selection for creating the next-generation offspring. Such a GA approach is expected to converge towards channel-separation coefficients that minimize the BP estimation error (which is our fitness function and our ultimate goal), but does not ensure convergence to a global optimum.
It also requires training a regression model with the current PTTs each time a candidate vector is evaluated. This is why we decided to use a regression model that can be trained quickly while historically showing good and robust performance [28]. The pseudocode is given in Algorithm 1.
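The GA loop described above can be sketched generically with the fitness passed in as a callable. In the paper the fitness is the cross-validated MAE of a Random Forest regressor; here it is left abstract so the example stays self-contained. The number of free coefficients, the tournament size and the mutation rate are assumptions, as is the choice to clip offspring back into [-1, 0].

```python
import numpy as np


def run_ga(fitness, n_coef=4, pop_size=100, generations=200, seed=0,
           tournament_k=3, mutation_rate=0.1):
    """Minimise `fitness` over coefficient vectors constrained to [-1, 0]."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-1.0, 0.0, size=(pop_size, n_coef))
    best, best_score = None, np.inf
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        i_best = int(np.argmin(scores))
        if scores[i_best] < best_score:
            best, best_score = pop[i_best].copy(), float(scores[i_best])

        def tournament():
            # Best of a few randomly drawn individuals becomes a parent.
            idx = rng.choice(pop_size, size=tournament_k, replace=False)
            return pop[idx[np.argmin(scores[idx])]]

        children = np.empty_like(pop)
        for c in range(pop_size):
            p1, p2 = tournament(), tournament()
            alpha = rng.uniform()
            child = alpha * p1 + (1.0 - alpha) * p2       # arithmetic crossover
            if rng.uniform() < mutation_rate:             # mutate a small subset
                j = rng.integers(n_coef)
                child[j] += rng.choice([-1.0, 1.0]) * rng.uniform(0.01, 0.1)
            children[c] = np.clip(child, -1.0, 0.0)       # re-impose constraints
        pop = children
    return best, best_score
```

A toy call such as `run_ga(lambda w: float(np.sum((w + 0.5) ** 2)))` exercises the loop; swapping in a regression-based or phase-delay-based fitness changes only the callable.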
Algorithm 1. GA-BP using Random Forest regressor as fitness function.

Generalized genetic algorithm using phase delay (GA-PD)
The previously described algorithm has a major drawback in that it requires ground-truth BP measurements for training the regression models, which are not readily available. The evaluations are also computationally expensive, especially if the chosen regression model is complex (e.g., Support Vector Machine with a polynomial kernel). We wanted to generalize further to achieve a solely input-data-driven approach, meaning a change in fitness function was required.
As the initial problem was rooted in the fact that PTTs are near-impossible to measure due to the mixture of traces, and we wanted to separate them as much as possible, an intuitive approach was to consider maximizing the phase delay between the channel-separated traces, assuming that this preserves the per-cycle PTT information in the sequence. This can alternatively be defined as minimization of the cross-correlation between the traces. The cross-correlation between signals x and y at different lags (phase delays) is computed using the MATLAB function xcorr(x, y). This returns correlations between the two signals at different possible delays or lags. The phase delay at which the correlation is lowest is taken. Since the signals are periodic, it is enough to check only L delays, where L is the average cycle length of a specific recording. Algorithm 1 is thus modified as given in Algorithm 2.
The proposed GA-PD (Algorithm 2) substantially lessens the input requirements, as it does not require the ground-truth BP to train a regression model for the fitness function computation. It also avoids the potentially high time complexity of the regression model training. For a simple model like Random Forest, the time complexity is O(n · log(n)) [37], but this can increase dramatically for more complex models like SVM, reaching O(n^2) or even O(n^3) depending on the kernel [38].
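The lag search described above can be sketched in NumPy instead of MATLAB. This is an illustrative stand-in for xcorr, not the authors' implementation: it evaluates only non-negative lags within one cycle, uses a fixed window so every lag is averaged over the same number of samples, and the function names are hypothetical.

```python
import numpy as np


def lagged_correlation(x, y, max_lag):
    """Mean product of x[n] and y[n + lag] for lag = 0 .. max_lag - 1."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    width = len(x) - max_lag            # same window length for every lag
    return np.array([np.dot(x[:width], y[lag:lag + width]) / width
                     for lag in range(max_lag)])


def lowest_correlation_lag(x, y, cycle_length):
    """Lag (in samples) within one cardiac cycle at which correlation is lowest."""
    return int(np.argmin(lagged_correlation(x, y, cycle_length)))
```

Inside a GA-PD style fitness, a candidate coefficient vector would first be used to separate the traces, and a quantity derived from this lag search would then be scored; no BP labels are involved at any point.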

PTT measurement
In the next step, the reference points were detected. We chose the point of maximum systolic slope as the reference point, as it is generally more stable than the systolic peak, which is more prone to misdetections due to (movement) noise. Once the reference points were detected, the sets of these points belonging to the same cardiac cycle were determined. A simple thresholding method was used in which the reference point of the green trace was selected as the basis (green is the most commonly used part of the spectrum and one that exhibits good pulsatility and robustness) and a small neighbourhood of 200 milliseconds around its location was investigated in the other traces. If reference points were found, these were put into a set belonging to the same cardiac cycle. Finally, the average of PTTs between these individual points was computed for each recording and used as a fundamental feature in the subsequent training of a regression model for BP prediction. Specifically, the PTT was computed between the NIR and green traces only, because the blue trace exhibits more noise and less pulsatility, as is expected given lesser perfusion [33]. Executing this pipeline without the channel separation step yields poor results, which are observed both numerically and visually, as shown in Table 3 and Fig. 5 respectively.
It turns out that without channel separation, the PTTs between reference points in different channels are often 0, meaning that the effects of channel overlap are so severe that an informative PTT cannot be computed. Furthermore, the assumption based on physiology and related work [7] is that the deeper NIR reference point should be detected before the shallower green one. Without channel separation, this is not observed consistently, as the reference point of the shallower green pulse can sometimes be detected before the deeper NIR one, resulting in negative PTTs.
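The reference point detection and cycle matching can be sketched as follows. The 200 ms neighbourhood is interpreted here as ±200 ms around the green reference point, and the minimum spacing between detected points is an added parameter; both are assumptions, as are the function names.

```python
import numpy as np
from scipy.signal import find_peaks


def max_slope_points(trace, fs, min_cycle_s=0.3):
    """Sample indices of maximum systolic slope (peaks of the first derivative)."""
    slope = np.gradient(np.asarray(trace, dtype=float))
    peaks, _ = find_peaks(slope, distance=max(1, int(min_cycle_s * fs)), height=0)
    return peaks


def mean_ptt(nir, green, fs, window_s=0.2):
    """Average PTT in seconds (green time minus NIR time) over matched cycles."""
    ref_green = max_slope_points(green, fs)
    ref_nir = max_slope_points(nir, fs)
    if len(ref_nir) == 0:
        return float("nan")
    ptts = []
    for g in ref_green:
        nearest = ref_nir[np.argmin(np.abs(ref_nir - g))]
        if abs(int(nearest) - int(g)) <= window_s * fs:   # same cardiac cycle
            ptts.append((int(g) - int(nearest)) / fs)
    return float(np.mean(ptts)) if ptts else float("nan")
```

With this sign convention a positive value means the NIR reference point precedes the green one, which matches the physiological expectation stated above; zero or negative values are the failure mode observed without channel separation.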

Fig. 5. Exhibited positive effects of channel separation that enables PTT computation between different wavelengths probing different skin layers [28]. Prominent effects can be observed in the left subplot between the dashed colored and dashed black line.

Results
We used the collected data described in Section 4 and the pipeline shown in Fig. 3 to compare the performance of the proposed channel separation algorithms against the physics-based approach and blind source separation methods, as well as against the baseline of using no channel separation. The physics-based approach gives the near-ground-truth (with some simplifications discussed later) channel separation coefficients for our specific recording setup, so we first checked the discrepancy between these and the coefficients found by the proposed algorithms. We then computed the final average MAEs for SBP and DBP prediction when evaluating a Random Forest model (fixed hyperparameters) in a LOSO experiment with and without personalization. This experiment was chosen because it avoids overfitting to individual subjects, and because the degree of personalization can be fully and clearly controlled. In the case without personalization, the model was trained on all subjects except the one it was tested on. It was shown in related work that it is infeasible to train a generalized BP prediction model for the general population, meaning that calibration/personalization should be considered to substantially improve the performance [6]. We thus investigated personalization by adding two instances of the left-out subject (one with elevated BP, one with resting BP) to the training data in each iteration of the LOSO experiment. We compared the MAEs computed in these two final experiments when using the channel separation coefficients from the physics-based approach, GA and GA-PD, against the blind source separation methods PCA and ICA, and also against the baseline of no channel separation.
The comparison between the channel separation coefficients obtained with each proposed algorithm (outside of PCA and ICA, which do not return coefficients but traces directly) is given in Table 2 and the comparison of the final MAEs achieved using PTTs computed with those coefficients is given in Table 3.
The channel separation coefficients of the physics-based approach [28] relied on a couple of simplifications, such as truncating the response of the NIR channel in other wavelengths to zero and equalizing the response of all channels in the NIR band (800-900 nm). The GA and GA-PD did not use these assumptions and thus returned non-zero values for a_2 and a_3, as seen in Table 2. Other coefficients were similar to those obtained with the physics-based approach, with a slight discrepancy: b_1 and b_3 differed by 0.06 and 0.045 on average, respectively, which is a 5.3% overall difference. These coefficient changes were also reflected in lower MAEs, as both GA and GA-PD performed better than the physics-based approach, as seen in Table 3. The lowest errors were achieved by the GA algorithm with an accompanying regression model trained each time. This makes sense as the fitness function being minimized is directly connected to the final evaluation metric, but it comes at a computational cost and with the additional requirement of ground-truth BP measurements, which are not trivial to obtain in real-world applications. The performance of the GA-PD follows closely, with on average less than 0.5 mmHg larger MAE, but it has lower requirements and is thus more suitable for practical use. The performance of ICA and PCA lags behind that of GA and GA-PD, which is again not unexpected given that blind source separation methods do not use prior knowledge or a model of the signal being demixed. Furthermore, they rely on independence and large amplitude differences between source signals, which are not present in our case. These methods are more likely to separate the pulsatile signal (rPPG) from the diffuse reflections and a potential ballistocardiographic movement signal (larger amplitude, but not always present), thus being better suited to denoising than to obtaining the desired channel separation and phase shift that enables accurate PTT measurement.
Despite this, all the proposed channel-separation algorithms outperform the baseline of using no channel separation. This is again expected, as the PTT computation without channel separation is based on only a few very subtle per-cycle PTTs, which are physiologically much less informative. We also observed consistently better performance when training a personalized model, as the subject-specific instances of the rest and activity scenarios helped the model calibrate to a specific subject. This is in line with previous work [6,32] and confirms the difficulty of training a robust general BP estimation model. This can additionally be seen in Fig. 6, where Bland-Altman plots are shown for the best-performing channel separation algorithm (GA-BP), both for a general and a personalized regressor.
An important observation is that predictive performance does not degrade between the rest and activity scenarios, which have substantially different BPs. The errors are fairly normally distributed regardless of the scenario. Such degradation is a known problem in the literature dealing with BP prediction, where the vast majority of cases is centered around the mean value, meaning that the predictor achieves low numerical errors but is expected to work well only in typical or normal BP ranges [6,32]. We observed consistent performance in our experiments, showing that the approach is robust across a broad range of BP values and hemodynamic states.
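For readers unfamiliar with Bland-Altman analysis, the quantities behind such a plot can be computed in a few lines. This is a generic sketch using the conventional 1.96 standard deviation limits of agreement; it is not taken from the paper's code, and the function name is hypothetical.

```python
import numpy as np


def bland_altman(reference, estimate):
    """Return per-sample means, differences, bias and 95% limits of agreement."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    diff = estimate - reference
    mean_pair = (estimate + reference) / 2.0
    bias = float(diff.mean())
    sd = float(diff.std(ddof=1))          # sample standard deviation
    return mean_pair, diff, bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

Plotting `diff` against `mean_pair` with horizontal lines at the bias and the two limits reproduces the usual figure, and a spread that does not widen at high mean BP corresponds to the robustness across hemodynamic states discussed above.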
There exist a number of standards for clinical BP monitoring devices. Two widely used ones are the Association for the Advancement of Medical Instrumentation (AAMI) standard, which is commonly used in the U.S. for the performance and accuracy of blood pressure monitors, and the British Hypertension Society (BHS) standard, which is more common in Europe. These standards focus on mean errors and the standard deviation of errors to assign devices corresponding grades. Mean errors reported without deviations can be misleading and exhibit very low numbers. In our case we did not report them explicitly because MAEs have a more intuitive meaning; however, our results indicate that the calibrated personalized model meets the AAMI SP10 requirements for both SBP and DBP, while the general model does not. Furthermore, in terms of the BHS standard, the personalized model results would be placed in the A grade, while the general model would be borderline A grade for DBP and C grade for SBP. It is important to note that SBP estimation is both more difficult and more important, since its variation is greater and SBP is the main early indicator of hypertension.
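The two criteria can be checked from a vector of signed errors (estimate minus reference, in mmHg) as sketched below. The thresholds are the commonly cited ones: mean error within ±5 mmHg and standard deviation of at most 8 mmHg for AAMI, and cumulative percentages of absolute errors within 5, 10 and 15 mmHg for the BHS grades. Protocol requirements such as the minimum number of subjects are not checked, and the function names are hypothetical.

```python
import numpy as np

# Minimum cumulative percentage of absolute errors within 5, 10 and 15 mmHg.
BHS_THRESHOLDS = {"A": (60, 85, 95), "B": (50, 75, 90), "C": (40, 65, 85)}


def meets_aami(errors):
    """True if |mean error| <= 5 mmHg and SD of errors <= 8 mmHg."""
    e = np.asarray(errors, dtype=float)
    return bool(abs(e.mean()) <= 5.0 and e.std(ddof=1) <= 8.0)


def bhs_grade(errors):
    """BHS grade (A, B, C, or D) from the distribution of absolute errors."""
    a = np.abs(np.asarray(errors, dtype=float))
    pct = [100.0 * np.mean(a <= limit) for limit in (5, 10, 15)]
    for grade, required in BHS_THRESHOLDS.items():
        if all(p >= r for p, r in zip(pct, required)):
            return grade
    return "D"
```

Both functions need the signed per-measurement errors rather than the MAE alone, which is why mean and standard deviation of errors have to be available when a formal grading is attempted.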
It is also worth noting that while the aim of traditional rPPG algorithms [34] is to use different channels and their fusion to reconstruct a single trace with high signal-to-noise ratio (SNR), our methods instead focus on maximizing the phase delay while preserving the important temporal physiological information in the form of PTT. The former is often more useful for HR estimation, which relies on a clear high-quality waveform, while the latter is better suited for BP estimation via the known relationship between PWV, PTT and BP.

Conclusion
We proposed novel data-driven generalized channel separation approaches based on genetic algorithms, to be used in contact-free remote PTT measurement and BP estimation.Using our proposed validation framework and data of 13 subjects collected during varying hemodynamic states, we confirmed that the proposed algorithms allow for measurement of precise PTT that shows much higher feature importance compared to morphological and demographic features.Such informative PTT in turn allows for accurate SBP and DBP estimation.Importantly, these algorithms for channel separation enable novel contact-free and continuous illumination MW PTT measurement.The proposed algorithms showed similar performance, substantially surpassing the baseline approaches of not using any such separation or using blind source separation without an underlying model, however each comes with its pros and cons: 1. Physics-based approach [28]: The advantage is that it is based on the precise quantum efficiency and spectral signature of the image sensor and light source, and thus allows for a clear connection between the underlying physics and the computed coefficients for channel separation (no black-box steps).The downside is its dependance of such precise input data, which are not trivial to obtain -each image sensor has its specific quantum efficiency and each light source a specific spectrum that must be measured using specialized and expensive hardware.

2. Genetic algorithm with a BP regression model (GA): The main advantage is that it is general, regardless of the hardware (camera and light source) specifics of a recording setup. It also achieves the lowest MAEs for BP prediction from channel-separated PTT. The main downside is that it requires varied ground-truth BP measurements alongside input data to train a BP regression model each time, which increases its computational complexity and lowers its potential for wide use.
3. Generalized genetic algorithm using cross-correlation (GA-PD): This algorithm directly resolves the main disadvantage of GA by not requiring ground-truth BP measurements while still being completely data-driven and hardware-independent. On the other hand, it achieves slightly worse performance in terms of BP MAEs compared to GA, but still better than the physics-based approach and substantially better than the no-channel-separation baseline. Furthermore, the approach can be used to measure PTT as a standalone parameter describing arterial stiffness, which is also an indicator of hypertension [39].
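To make the role of channel separation and cross-correlation more concrete, the following minimal sketch shows the kind of quantity a GA-PD-style search could score for a candidate set of separation coefficients. It is not the implementation used in this work. The two-trace synthetic setup, the symmetric overlap factor `ALPHA`, the sampling rate `FS` and the helper names `unmix`, `xcorr_lag` and `pulse` are all assumptions made for illustration, and the constraints and search procedure of the genetic algorithm itself are omitted.

```python
import math


def unmix(channels, weights):
    """Linear separation: out[k][t] = sum_c weights[k][c] * channels[c][t]."""
    n = len(channels[0])
    return [[sum(w[c] * channels[c][t] for c in range(len(channels)))
             for t in range(n)] for w in weights]


def xcorr_lag(a, b, max_lag):
    """Lag in samples that maximizes the cross-correlation of a and b.

    A positive result means that b is delayed with respect to a.
    """
    n = len(a)
    best_lag, best_val = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        # Sum only over indices where both a[t] and b[t + lag] exist.
        val = sum(a[t] * b[t + lag]
                  for t in range(max(0, -lag), min(n, n - lag)))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


def pulse(n, center, sigma=10.0):
    """Synthetic Gaussian pulse standing in for one cardiac cycle."""
    return [math.exp(-((t - center) ** 2) / (2 * sigma ** 2)) for t in range(n)]


FS = 100.0        # assumed sampling rate in Hz (hypothetical)
TRUE_DELAY = 8    # assumed delay between skin layers in samples (hypothetical)
ALPHA = 0.6       # assumed symmetric channel overlap (hypothetical)

shallow = pulse(200, 100)               # response from a superficial layer
deep = pulse(200, 100 + TRUE_DELAY)     # delayed response from a deeper layer

# Channel overlap: each observed trace is a mixture of both responses.
observed = unmix([shallow, deep], [[1.0, ALPHA], [ALPHA, 1.0]])

# Candidate separation coefficients; here the exact inverse of the mixing,
# which a data-driven search would have to approximate.
det = 1.0 - ALPHA ** 2
candidate = [[1.0 / det, -ALPHA / det], [-ALPHA / det, 1.0 / det]]
separated = unmix(observed, candidate)

raw_lag = xcorr_lag(observed[0], observed[1], max_lag=20)
sep_lag = xcorr_lag(separated[0], separated[1], max_lag=20)
ptt_ms = 1000.0 * sep_lag / FS   # delay converted to milliseconds
```

In this synthetic setting the delay measured between the overlapping traces (`raw_lag`) is smaller than the true delay, while the separated traces recover it (`sep_lag` equals `TRUE_DELAY`). This is the effect that motivates channel separation before PTT computation.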

Limitations and future work
The ultimate goal is the application of contact-free single-site BP measurement in real-world settings, but the methods have so far only been validated in a controlled lab setting. The approach still faces many challenges before robust and reliable widespread application. The signals are very subtle and extremely sensitive to any distortions that might make reference point computation imprecise. Any movement is thus a challenge, which can be only partially resolved with preprocessing, especially if the noise is severe. Optical flow tracking algorithms like Lucas-Kanade could be used to stabilize the ROI. Preprocessing must also introduce no per-cycle phase shifts of the reference points (e.g., filtering must not distort the systolic rise and systolic peak location), so any preprocessing step that alters the signal must be chosen carefully. rPPG work relying on the face as the region of interest is often limited by different skin tones, make-up, etc. The palm offers an overall more robust measurement site, exhibiting a good balance between skin thickness and perfusion, more consistent skin exposure (no make-up, no hair, no glasses, etc.) and consistently low melanin content. Using the palm also offers better real-world implementation options compared to the face, for instance a closed box into which the hand is inserted. On the other hand, heavy sweating of the palm (nervousness, heat) is a challenge due to specular reflections, which must be considered in the context of polarizer usage, which further decreases the amount of light reaching the image sensor. Finally, the proposed approach is semi-black-box (a pre-defined color model is used) in the computation of channel separation coefficients and does not inherently ensure convergence to a global optimum. End-to-end neural networks (CNN + LSTM or 3DCNN) will be considered as an alternative given that there is potential for improved performance, which can then potentially be linked to an explanation via explainability mechanisms.
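The requirement that filtering must not shift reference points can be illustrated with a small sketch. A causal low-pass filter delays the peak of a pulse, whereas forward-backward (zero-phase) filtering leaves its location unchanged. The single-pole filter, the smoothing factor and the synthetic Gaussian pulse are assumptions chosen for brevity and are not the filters used in this work. Library routines such as SciPy's `filtfilt` apply the same forward-backward principle to general filters.

```python
import math


def lowpass(x, alpha):
    """Causal single-pole low-pass: y[t] = y[t-1] + alpha * (x[t] - y[t-1])."""
    y, state = [], x[0]
    for value in x:
        state += alpha * (value - state)
        y.append(state)
    return y


def lowpass_zero_phase(x, alpha):
    """Forward-backward filtering.

    The second pass runs over the time-reversed output of the first pass,
    so the phase delay of the two passes cancels out.
    """
    forward = lowpass(x, alpha)
    backward = lowpass(forward[::-1], alpha)
    return backward[::-1]


def peak_index(x):
    """Index of the first maximum of x."""
    return max(range(len(x)), key=x.__getitem__)


# Synthetic symmetric pulse with its peak at sample 100 (hypothetical data).
pulse = [math.exp(-((t - 100) ** 2) / (2 * 10.0 ** 2)) for t in range(201)]

causal_peak = peak_index(lowpass(pulse, 0.2))                 # shifted later
zero_phase_peak = peak_index(lowpass_zero_phase(pulse, 0.2))  # stays at 100
```

Because the delay introduced by a causal filter generally depends on frequency, it can affect traces with different spectral content differently and thereby bias the measured PTT, which is why a zero-phase scheme is the safer choice for this kind of analysis.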
Funding.

Fig. 2. Layered skin structure and the principle of the proposed multi-wavelength remote PTT monitoring compared with traditional multi-site PTT [28].

Fig. 4. Comparison of average feature importances obtained from the Random Forest regressors for BP prediction. Green box denotes the most important feature, while orange and red boxes denote features with low to very-low importance.

Fig. 5. Exhibited positive effects of channel separation that enables PTT computation between different wavelengths probing different skin layers [28]. Prominent effects can be observed in the left subplot between the dashed colored and dashed black line.

Fig. 6. Bland-Altman plots after GA-BP channel separation for systolic (first row) and diastolic (second row) BP. Columns represent training a general or personalized regression model.

Table 3. Comparison of the final MAEs in mmHg for SBP and DBP estimation when using different channel separation algorithms. We compare against the baseline of using no channel separation. We report results for experiments with and without personalization in the final LOSO experiment.
General Program of National Natural Science Foundation of China (62271241); National Key Research and Development Program of China (2022YFC2407800); Javna Agencija za Raziskovalno Dejavnost RS.