Using imaging photoplethysmography for heart rate estimation in non-human primates

For humans and for non-human primates heart rate is a reliable indicator of an individual’s current physiological state, with applications ranging from health checks to experimental studies of cognitive and emotional state. In humans, changes in the optical properties of the skin tissue correlated with cardiac cycles (imaging photoplethysmogram, iPPG) allow non-contact estimation of heart rate by its proxy, pulse rate. Yet, there is no established simple and non-invasive technique for pulse rate measurements in awake and behaving animals. Using iPPG, we here demonstrate that pulse rate in rhesus monkeys can be accurately estimated from facial videos. We computed iPPGs from eight color facial videos of four awake head-stabilized rhesus monkeys. Pulse rate estimated from iPPGs was in good agreement with reference data from a contact pulse-oximeter: the error of pulse rate estimation was below 5% of the individual average pulse rate in 83% of the epochs; the error was below 10% for 98% of the epochs. We conclude that iPPG allows non-invasive and non-contact estimation of pulse rate in non-human primates, which is useful for physiological studies and can be used toward welfare-assessment of non-human primates in research.


Introduction
Heart rate is an important indicator of the functional status and psycho-emotional state for humans, non-human primates (NHP) [1][2][3] and other animals [4,5]. A common and the most direct tool for measuring heart rate is the electrocardiogram (ECG) [6]. ECG acquisition requires application of several cutaneous electrodes, which is not always possible even for human subjects [7]. ECG acquisition in NHP is complicated by the fact that conventional ECG electrodes require shaved skin and that awake animals do not easily tolerate skin-attachments unless extensively trained to do so. Hence, ECG in NHPs is either collected from sedated animals [6,8,9] or by implanting a telemetry device in combination with highly invasive intracorporal sensors [10][11][12]. For a non-invasive ECG acquisition from non-sedated monkeys a wearable jacket can be used [1,13,14], but this also requires extensive training and physical contact when preparing the animal, which may affect the physiological state. Table 1. Subjects data and illumination conditions of experimental sessions. For illumination we used either fluorescent lamps mounted at the room ceiling (Philips Master Tl-D 58W/840, 4000K cold white) or setup halogen lamps (Philips Brilliantline 20W 12V 36D 4000K). Luminance was measured by luminance meter LS-100 (Minolta) with close-up lens No 135 aiming at the ridge of the monkey nose (settings: calibration preset; measuring mode abs.; response slow; measurement distance % 55 cm). (Niedersaechsisches Landesamt fuer Verbraucherschutz und Lebensmittelsicherheit (LAVES)). The animals were pair-or group-housed in facilities of the German Primate Center (DPZ) in accordance with all applicable German and European regulations. The facility provides the animals with an enriched environment (incl. a multitude of toys and wooden structures [32,33]), natural as well as artificial light and access to outdoor space, exceeding the size requirements of European regulations. The animals' psychological and veterinary welfare is monitored daily by the DPZ's veterinarians, the animal facility staff and the lab's scientists (see also [34]). During the video recordings, the animals were head-stabilized and sat in the primate chair in their familiar experimental room, facing the camera. Recording sessions were performed during the preparation phase for the cognitive task. Animals were not rewarded during the video recording to minimize facial motion from sucking, chewing or swallowing.

Materials and set-up
Our study of pulse rate estimation consisted of ten experimental sessions. Video acquisition parameters for Sessions 1-8 where RGB video was recorded are provided in Table 2. We have tried several acquisition parameters to see whether they affect the iPPG quality, but did not observe any considerable difference. Most methods and results are reported for the RGB video. In addition, monochrome near-infrared (NIR) video was acquired during Session 2 (simultaneously with RGB) and in Sessions 9 and 10, as a control to compare pulse rate estimation from NIR and RGB video, see Section Estimation of pulse rate from near-infrared video for details. All videos were acquired at ambient (non-dedicated) light, see Table 1 for details. We used different illumination conditions since setting a particular illumination is often not possible in everyday situations, and we wanted our video-based approach to be able to cope with this variability.
Facial videos of rhesus monkeys were processed to compute iPPG signals. Fig 1 illustrates the main steps of iPPG extraction, which we will consider below in detail: selection of pixels that might contain pulse-related information is described in Section Selecting and refining the region of interest, computation and processing of iPPG is considered in Section Extraction and processing of imaging photoplethysmogram.
To estimate pulse rate from iPPG (videoPR) we computed the discrete Fourier transform (DFT) of iPPG signal in sliding overlapping windows; for each window pulse rate is estimated by the frequency with highest amplitude in the heart-rate bandwidth, which is 90-300 BPM (1.5-5 Hz) for rhesus monkeys [2, [35][36][37]. This approach is commonly used for human iPPG  Table 2. A larger window leads to poor temporal resolution, while a smaller window results in crude frequency resolution (! 4 BPM). Video processing, ROI selection and computation of color signals are implemented in C++ using OpenCV (opencv.org). iPPG extraction and processing, as well as pulse rate estimation, are implemented in Matlab 2016a (www.mathworks.com).
Reference pulse rate (refPR) values were obtained by a pulse-oximeter CAPNOX (Medlab GmbH, Stutensee) using a probe attached to the monkey's ear. This pulse-oximeter does not provide a data recording option, but it detects pulses and displays on the screen a value of pulse rate computed as an inverse of the average inter-pulse interval over the last 5 s. The pulse-oximeter measures pulse rate within the range 30-250 BPM with a documented accuracy of ± 3 BPM. The pulse-oximeter screen was recorded simultaneously with the monkeys' facial video and pulse rate was extracted off-line from the video.
To compare videoPR with refPR, we used epochs of length equal to the length of DFT windows with 50% overlap (see Table 2). For this we averaged the refPR values over the epochs. Note that the quality of the raw iPPG signal was low compared to the contact PPG acquired by the pulse-oximeter. Therefore we were unable to directly detect pulses in the signal, which would provide better temporal resolution (see Section Pulse rate estimation for details).
Selecting and refining the region of interest. To reduce spatially uncorrelated noise and enhance the pulsatile signal, we averaged values of color channels over the ROI [21,38]. In order to select pixels containing maximal amount of pulsatile information, we used the following three-step algorithm:  Step 1. A rectangular boundary for the ROI was selected manually for the first frame of the video. Since no prominent motion was expected from a head-stabilized monkey, this boundary remained the same for the whole video. We tried six heuristic regions shown in Fig 2; the best iPPG extraction for most sessions was achieved for the noseand-cheeks region 3 (Fig 3).
Step 2. Since hair, eyes, teeth, etc. provide no pulsatile information and deteriorate quality of the acquired iPPG, only skin pixels should constitute the ROI. To distinguish between skin and non-skin pixels, we transformed frames from the RGB to the HSV (Hue-Saturation-Value) color model as recommended in [39,40]. We excluded all pixels having either H, S or V value outside of a specified range. Three HSV ranges describing monkey skin for different illumination conditions (Table 1) were selected by manual adjustment under visual control of the resulting pixel area, see Table 3.
Step 3. In addition, for each frame we excluded all outlier pixels that differed significantly from other pixels in the ROI. The aim of this step was to eliminate pixels corrupted by artifacts. Namely, we excluded pixel (i, j) in the k-th frame if for it the value of any color channel c i;j k did not satisfy the inequality where m k and σ k are the mean and standard deviation of color channel c for pixels included in the ROI of the k-th frame at steps 1-2; the coefficient 1.5 was selected based on our previous work [41]. were computed as averages for each color channel over the ROI obtained by refining the noseand-cheeks region 3 (Fig 3) for every frame k. Then, prior to iPPG extraction, we mean-centered and scaled each color signal c 0 k to make them independent of the light source brightness level and spectrum, which is a standard procedure in iPPG analysis [22,24]: where m k,M is an M-point running mean of color signal c 0 k : For k < M we use m 1,M . We followed [24] in taking M corresponding to 1 s. The frequency corresponding to the refPR value and its second harmonic are indicated by dashed and dotted lines, respectively. Although the peak corresponding to the pulse rate is most prominent for region 3, it is distinguishable for other regions as well.
https://doi.org/10.1371/journal.pone.0202581.g003 For iPPG extraction we used the "green-red difference" (G-R) method [42], which is simple and effective for computing human iPPG [24,41]. This method is based on the assumption that green color signal carries maximal amount of pulsatile information, while red color signal contains little relevant information but allows to compensate those artifacts common for both color channels (see Section Imaging photoplethysmography under visible and near-infrared light for a discussion of pulsatile information provided by different colors). Thus iPPG was computed as a difference of green and red color signals: where g k and r k are mean-centered and scaled green and red color signals, respectively.
We have also employed two other methods for iPPG extraction, CHROM [22] and POS [24], that are most effective for computing human iPPG [24,41]. POS computes iPPG as a combination of color signals: where s x k;L and s y k;L are L-point running standard deviations of x k = g k − b k and y k = g k + b k − 2r k , respectively.
CHROM combines color signals in a slightly different way: where s x k;L and s y k;L are L-point running standard deviations of x k = 0.77r k − 0.51g k and y k = 0.77r k + 0.51g k − 0.77b k . For both methods we took L corresponding to 1 s so that the time window captured at least one cardiac cycle as recommended in [24].
Note that our implementation of POS and CHROM methods is slightly different from the original in [22,24]: to ensure signal smoothness, we compute running means and standard deviations instead of computing segments of iPPG in several overlapped windows and then gluing them by the overlap-add procedure described in [22]. Processing of iPPG signal for POS and CHROM methods was the same as for G-R method.
Since we used low-cost cameras, we expected a poor quality of the raw iPPG signal compared to contact PPG. Therefore we post-processed iPPG using three following steps, typical for iPPG signal processing [22,25,42,43].
Step 1. We suppressed frequency components outside of the heart-rate bandwidth 90-300 BPM (1.5-5 Hz) using finite impulse response filter of the 127th order with a Hamming window as suggested in [25].
Step 2. We suppressed outliers in the iPPG signal by cropping amplitude of narrow high peaks that could deteriorate performance on the next step: where the coefficient 3 was selected based on the classical three-sigma rule, m k,W and σ k,W are W-point running mean and standard deviation of the iPPG signal respectively: We took W equal to 2 s to have estimates of mean and standard deviation over several cardiac cycles.
Step 3. We applied a wavelet filtering proposed in [44] to suppress secondary frequency components. This filtering consists of two steps: first a wide Gaussian window suppresses frequency components remote from the frequency corresponding to maximum of average power over 30 s (global filtering for suppressing side bands that could cause ambiguities in local filtering). Then a narrow Gaussian window is applied for each individual time-point's maximum (local filtering to emphasize the "true" maximum).
As suggested in the original article, we employed Morlet wavelets. We used scaling factors 2 and 5 for global and local filtering, respectively (see [44] for details), since these parameters provide best pulse rate estimation in our case. We implemented wavelet filtering using Matlab functions cwtft/icwtft from Wavelet Toolbox.
To demonstrate the effect of the multi-step iPPG processing, we show in Fig 4 how each processing step magnifies the pulse-related component.

Estimation of pulse rate from near-infrared video.
To check the possibility of iPPG extraction from monochrome near-infrared (NIR) video in NHP, we recorded video using the Kinect NIR sensor during Session 2. Additionally, we conducted Sessions 9 and 10 (see Table 1) using the monochrome camera Chameleon 3 U3-13Y3M (FLIR Systems, OR, USA) with ultraviolet/visual cut-off filter R-72 (Edmund Optics, NJ, USA) blocking light with wavelengths below 680 nm. Details of the recorded video are presented in Table 4. Extraction of iPPG from monochrome NIR video was similar to that from RGB video: we computed raw iPPG as average intensity over all pixels in a ROI (monochrome NIR video provides a single intensity channel, so we could not use G-R method in this case) and then processed the signal as described in Section Extraction and processing of imaging photoplethysmogram. For ROI selection we determined skin pixels as having intensity within the range specified in Table 4.

Data analysis
To access the quality of pulse rate estimation, we compared iPPG-based pulse rate estimates (videoPR) with the reference pulse rate estimates by contact pulse oximetry (refPR). For each session we computed the mean absolute error (as the absolute difference between videoPR and refPR, averaged across all epochs) and the Pearson correlation coefficient. To facilitate the interpretation of the results we also present percentages of epochs with estimation error in a certain range.
Additionally, in Section Effects of motion on iPPG quality, we assessed the quality of iPPG by computing the signal-to-noise ratio as suggested in [22] (we considered frequency components as contributing to signal if they are in range [refPR − 9BPM, refPR + 9BPM] or [2 Á refPR − 18BPM, 2 Á refPR + 18BPM]). The signal-to-noise ratio is a better indicator of iPPG quality than an error of pulse rate estimation, since pulse rate can be estimated correctly even from a low-quality iPPG. In the same section we estimated the amount of motion that may affect iPPG acquisition by averaging over time mean squared differences between successive frames: where K is the number of frames, W and H are the width and length of the region of interest,

Main result: Estimation of pulse rate from RGB video
To check the usefulness of the iPPG signal for pulse rate estimation in rhesus monkeys, we computed several quality metrics characterizing values of pulse rate derived from the video (videoPR) in comparison with the reference pulse rate from the pulse-oximeter (refPR).
The values of quality metrics presented in Table 5 show that pulse rate estimation from iPPG was successful. Bearing in mind that accuracy of refPR is ± 3 BPM, obtained values of mean absolute error for videoPR are rather good. Altogether, for 80% of epochs error of pulse rate estimation was below 7 BPM (which is 5% of rhesus monkey average heart rate 140 BPM), Imaging photoplethysmography in non-human primates and for 83% of epochs error of pulse rate estimation was below 5% of refPR value. The low correlation between videoPR and refPR for Session 4 is explained by two outlier data-points for which refPR is above average while videoPR is well below average (estimation errors for these epochs are 18 and 11 BPM); after excluding these two points the correlation for Session 4 was 0.58. Fig 5 shows the Bland-Altman plots for all sessions (in four monkeys). One can see that for most epochs the error of pulse rate estimation was low; it was slightly higher for high pulse rate. Fig 6 shows power spectrograms of the processed iPPG and average values of videoPR for Sessions 1 and 5 in comparison with the reference pulse rate from the pulse-oximeter (refPR). For both sessions, videoPR allows to track the pulse rate changes comparably to the data from pulse-oximeter (see also S1 Fig).

Multi-step iPPG processing results in better pulse rate estimation
In this study we used a multi-step procedure for iPPG signal processing. To demonstrate that all the steps are important for the quality of pulse rate estimation, we estimated pulse rate omitting certain processing steps. Average quality metrics for these estimates are presented in Table 6, as one can see only a combination of filters improves the pulse rate estimation.

Comparison of methods for iPPG extraction
All results reported so far were obtained using G-R method [42] for iPPG extraction. We have additionally estimated pulse rate from the iPPG signal extracted by CHROM [22] and POS [24] methods (iPPG processing was the same as for G-R method). Fig 7 shows that performance of G-R and POS methods was rather similar. Worse performance of CHROM was likely due to the fact that it is based on a model of a human skin [24], thus additional research is required to adjust it for iPPG extraction in NHP.

Effects of motion on iPPG quality
Although we have considered here iPPG extraction for head-stabilized monkeys, our subjects exhibited a considerable number of facial movements. Fig 8 illustrates their negative impact on iPPG extraction quality: the more prominent are movements during an epoch, the lower is quality of extracted iPPG.

Pulse rate estimation from near-infrared video was not successful
As indicated by the quality metrics in Table 7, pulse rate estimation from near-infrared (NIR) video was not successful. The low quality of estimation reflects that pulse rate had most of the time lower power than other frequencies related to movements, lighting variations, other artifacts or noise (see Fig 9).

Imaging photoplethysmography under visible and near-infrared light
Quality of the photoplethysmogram (PPG) acquisition in general strongly depends on the light wavelength. The PPG signal is generated by variations of the reflected light intensity due to changes in scattering and absorption of skin tissue. In the ideal case, light would be only absorbed by blood haemoglobin, which would make PPG a perfect indicator of the blood volume changes [16,48]. However, light is also absorbed and scattered by water, melanin, skin pigments and other substances [16]. There is no general consensus about the optimal wavelength for PPG acquisition, but the trend is in favor of employing visible light [18,49]. Although traditionally PPG was acquired at near-infrared (NIR) wavelengths [15,16], experiments have shown that the light from the Imaging photoplethysmography in non-human primates visible spectral range allows to acquire comparably accurate PPG [48,50,51] or to even reach higher accuracy [52,53]. Theoretical studies and simulations in [42] indicate that the best signal-to-noise ratio in iPPG should be obtained for wavelengths in the ranges of 420-455 and 525-585 nm, with peaks at 430, 540 and 580 nm corresponding to violet, green and yellow light, respectively (see [42,Section 3.5] for details). Passing through epidermis and bloodless skin is optimal for green (510-570 nm), red (710-770 nm) and NIR light (770-1400 nm) [54,55]. However, absorption of haemoglobin is maximal for the green and yellow (570-590 nm) light [52,56,57]. This results in a better signal-to-noise ratio for these wavelengths [18, 49,52].
Our results show that monochrome NIR video acquired with single non-specialized cameras and without dedicated illumination was not suitable for iPPG extraction, while RGB visible-light video was suitable. The striking difference between the results is not altogether surprising. Combining several color channels when extracting iPPG from RGB video, is more effective than considering data from a single wavelength [21,22]. Specifically, it makes iPPG extraction from RGB video less sensitive to movements and lighting variations, while for NIR  Imaging photoplethysmography in non-human primates  5 (B). We assessed quality of iPPG by signal-to-noise ratio and estimated amount of motion by averaged interframe difference, see Section Data analysis. In epochs with more motion, signal-to-noise ratio decreases, which reflects the adverse influence of motion on iPPG signal quality. Note the inverted y-axis for motion (averaged interframe difference).
https://doi.org/10.1371/journal.pone.0202581.g008 Imaging photoplethysmography in non-human primates video to compensate them simultaneous video acquisition from several cameras is typically used [58,59]. Additional research is however required to check whether visible or NIR light is generally preferable for iPPG extraction in NHP. In particular, two following obstacles may hinder accurate iPPG extraction from RGB video of NHP faces under visible light: • High melanin concentration. Melanin strongly absorbs visible light with wavelengths below 600 nm [16], which degrades the quality of PPG for humans with a high melanin concentration [43,47]. This often motivated using red or NIR light for PPG acquisition [16, 17], but some modern methods for iPPG extraction successfully alleviate this problem [47]. To the best of our knowledge, melanin concentration in monkey facial skin was not compared to that of humans, and the question of melanin concentration influence on iPPG extraction in NHP remains open.
• Insufficient light penetration depth. For a light of wavelength λ within 380-950 nm (visible and NIR light) with fixed intensity, the higher λ is, the deeper the light penetrates the tissue [51,60]. This is important since pulsatile variations are more prominent in deeper lying blood vessels. For instance, the penetration depth of the blue light (400-495 nm) is only sufficient to reach the human capillary level [42,61], providing little pulsatile information [23]. Green light penetrates up to 1 mm below the human skin surface, which makes the reflected light sensitive to the changes in the upper blood net plexus and in the reticular dermis [42,60]. To assess blood flow in the deep blood net plexus, one uses red or NIR light with penetration depth above 2 mm [21,60]. Insufficient penetration depth of visible light may hinder iPPG acquisition from certain parts of human body [19,42], but human faces allow extraction of accurate iPPG from the green and even the blue light component [21,51]. For rhesus monkeys iPPG extraction is simplified by the fact that their facial skin has a dense superficial plexus of arteriolar capillaries [62]. It has been previously demonstrated [63] that subtle changes in facial color of rhesus monkeys caused by blood flow variations can be detected using a camera sensor. For other NHP it is not clear which wavelength if any  3). The frequencies corresponding to the refPR value and its second harmonic are indicated by dashed and dotted lines, respectively. While the peak corresponding to the pulse rate is distinguishable, the signal is strongly contaminated with noise that hampers correct pulse rate estimation. https://doi.org/10.1371/journal.pone.0202581.g009 would be sufficient to penetrate the skin and systematic studies are required to answer this question.

Video acquisition and region of interest selection
A good video sensor is necessary for successful iPPG extraction. In this study we used three different sensors: RGB and NIR sensors of a Microsoft Kinect for Xbox One and Chameleon 3 1.3 MP (color and monochrome models), details are provided in Table 8. Although all these CMOS-sensors have moderate characteristics, they are better than sensors of most general-purpose cameras used for iPPG extraction in humans (see, for instance, [25,67]). Still, specialized cameras, like those used in [23,42], could provide more robust and precise iPPG extraction in NHP. For the choice of the camera such characteristics as pixel noise level and sensitivity are crucial, since iPPG extraction implies detection of minor color changes. Besides, sufficient frame rate is required to capture heart cycles. For humans the minimal frame rate for iPPG extraction is 15 fps [25], though in most cases 30 fps and higher are used [67]. Since heart rate for rhesus monkeys is almost twice as high as for humans, higher frame rate is required. In our study frame rate of 30 fps (provided by Microsoft Kinect) was sufficient to estimate pulse rate from iPPG.
For practical applications, one needs a criterion for rejecting video frames if they contain too little reliable pulsatile information for accurate iPPG extraction. One possible solution is to reject video frames as invalid if the signal-to-noise ratio of the acquired iPPG is not above a threshold computed from ROIs that do not contain exposed skin (e.g. regions covered by hair). See also [68] for a method of rejecting invalid video based on spectral analysis.
In this paper we have only considered manual selection of ROI, which is acceptable for head-stabilized NHPs. In the general case, automatic face tracking would be of interest [69]. In this study, a cascade classifier using the Viola-Jones algorithm [70] allowed detection of monkey faces in recorded video. Furthermore, techniques for automatic selection of regions providing most pulsatile information in humans [38,46,71,72] can be adopted. This might improve quality of iPPG extraction in NHPs, since application of adaptive model-based techniques to ROI refinement results in better iPPG quality and more accurate pulse rate estimation in humans [46,72].
For non-head-stabilized NHP, robustness of the iPPG extraction algorithm to motion is especially important. For humans this problem can be successfully solved by separating pulserelated variations from image changes reflecting the motion [73], but for NHP additional research is required. In particular, hair covering parts of NHP face can provide references for successful compensation of small motion (see [42], where similar methods are proposed for humans).
An even more challenging task is iPPG extraction for freely moving NHP. In this case distance of the animal face to camera sensor and relative angles to the sensor and to the light sources are time dependent. For humans, a model taking this into account was suggested in [24], however its stable performance has been only demonstrated for relatively short (several meters) and nearly constant distances from the face to the camera. Nevertheless, recent Imaging photoplethysmography in non-human primates progress in iPPG extraction from a long distance video [67] and in compensation of the variable illumination angle and intensity [24,74] gives hope that required models might be developed in future.

Pulse rate estimation
In our study estimated pulse rates were in the range of 95-125 BPM for monkey Sun, 125-150 BPM for Fla and 160-230 BPM for Mag, which agrees with previously reported values of heart rate (120-250 BPM) for rhesus monkeys sitting in a primate chair [2, [35][36][37]. Performance of our method for pulse rate estimation (Table 5) was only slightly worse than those reported for humans: mean absolute error obtained in [75] was 2.5 BPM; fraction of epochs with error below 6 BPM (about 8% of average human pulse rate) for the best method considered in [45] was achieved for 87% of epochs; reported values of the Pearson correlation coefficient vary from 0.87 in [45] to 1.00 in [25]. Comparing our results with outcomes of human studies, one should also allow for the imperfect reference since data from pulse-oximeter are not as reliable as ECG.
In this study we used the discrete Fourier transform (DFT) for pulse rate estimation. Despite of its popularity this method is often criticized as being imprecise [19,45]). Indeed, application of DFT implies analysis of iPPG signal in a rather long time window (20-40 s), while pulse rate is non-stationary and scarcely remains constant for several heart beats in row [19]. These fluctuations blur the iPPG spectrum and may hinder pulse rate estimation.
As an alternative to using DFT, we estimated instantaneous pulse rate from inter-beat intervals defined either as difference between adjacent systolic peaks (maximums of iPPG signal) or as diastolic minima (troughs of the signal) [7]. Since the wavelet filter greatly smoothed the signal and modified its shape (see Fig 4), we excluded this processing step. We tested several methods for peak detection in PPG [76,77], but the results were inconclusive and highly dependent on the quality of iPPG signal, therefore we do not present them here. Further research is required to find suitable algorithms for estimation of inter-beat intervals from iPPG in NHP. Specifically, the choice of methods for iPPG signal processing prior to peak detection seems to be important. Note that the post-processing applied here destroys the shape of pulse waves (see Fig 4). However, in this study the quality of iPPG signal was only sufficient for a rough frequency-domain analysis, thus the shape of iPPG signal was not critical.
Another possible method for pulse rate estimation is provided by the wavelet transform [19,43], which can be naturally applied after wavelet filtering (without reconstructing the filtered iPPG signal by inverse wavelet transform).

Limitations
In this study we have acquired data from four adult male rhesus monkeys. Since our subjects were of various ages (8-16 years) and had pulse rate values from a wide range, we consider this sample size sufficient to demonstrate feasibility of iPPG extraction for pulse rate estimation in adult male rhesus monkeys. However, a study with a larger number of subjects, also including females and juveniles, might help to determine a more general applicability of this method.
Since non-invasive ECG acquisition from non-sedated monkeys requires extensive training (similar to the surface electromyography, see [78,79]), we used pulse-oximetry to obtain reference pulse rate values. However, ECG provides a more accurate reference and is less affected by motion.
Finally, here we used only relatively simple methods of iPPG extraction and processing. Meanwhile, several advanced methods were suggested recently for human iPPG, especially for ROI selection and iPPG extraction. Using these methods (probably with certain modifications for the NHP data), should enhance of iPPG extraction quality and allow iPPG extraction in NHP without head-stabilization.

Summary
Here we evaluated and documented the feasibility of imaging photoplethysmogram extraction from RGB facial video of head-stabilized macaque monkeys. Our results show that one can estimate, accurately and non-invasively, the pulse rate of awake non-human primates without special hardware for illumination and video acquisition and using only standard algorithms of signal processing. This makes imaging photoplethysmography a promising tool for non-contact remote estimation of pulse rate in non-human primates, suitable for example for behavioral and physiological studies of emotion and social cognition, as well as for the welfareassessment of animals in research.

S1 Fig. Comparison of imaging photoplethysmogram with a contact photoplethysmogram.
(A) Imaging photoplethysmogram (iPPG) aligns with contact photoplethysmogram (PPG) recorded using pulse oximeter P-OX100L (Medlab GmbH, Stutensee; documented accuracy ± 1%) for Session 8 (in addition to the basic reference pulse oximetry, which was the same as for other sessions). (B) Pulse rate estimated from this more precise PPG (refPR2) has very good agreement with the reference pulse rate refPR (mean absolute error 1.41, Pearson correlation 0.98). Notably, videoPR computed from iPPG has even better agreement with refPR2 than with refPR (mean absolute error 3.24, Pearson correlation 0.90). Comparison of imaging photoplethysmogram with a contact photoplethysmogram. (A) Imaging photoplethysmogram (iPPG) aligns with contact photoplethysmogram (PPG) recorded using pulse oximeter P-OX100L (Medlab GmbH, Stutensee; documented accuracy ± 1%) for Session 8 (in addition to the basic reference pulse oximetry, which was the same as for other sessions). (B) Pulse rate estimated from this more precise PPG (refPR2) has very good agreement with the reference pulse rate refPR used in all sessions (mean absolute error 1.28 BPM, Pearson correlation 0.99). Notably, videoPR computed from iPPG has even better agreement with refPR2 than with refPR (mean absolute error 3.24 BPM, Pearson correlation 0.90, cf. metrics of correspondence for videoPR and refPR in Table 5