Accurate targeting in robot-assisted TCM pulse diagnosis using adaptive sensor fusion

ABSTRACT


Introduction
Robot-assisted diagnosis has become an applied medical examination procedure that uses capable proprioceptive sensors and autonomous robotic actuators to deliver higher diagnostic efficiency and richer diagnostic information [1], [2]. Accurate targeting of the diagnostic location is an essential capability of a diagnostic robot. It includes identifying patterns of the human body parts of interest, continuously tracking the target, and adaptively communicating with or controlling the robot during dynamic interaction [3], [4]. Because of its extensive applications in medical examination, there has been considerable interest in developing solutions for accurate targeting [4]-[6].
Traditional Chinese Medicine (TCM) pulse diagnosis, the key procedure of the four-diagnosis patient examination, usually collects radial pulse information at structured locations on the wrist where the radial artery lies [7], [8]. In oriental medicine these positions are known as 'Cun' (Inch), 'Guan/Gwan' (Bar), and 'Chi' (Cubit) (Fig. 1(a) and 1(b)). TCM palpation requires precise physical contact, so the approach of the diagnostic robot (Fig. 1(c)) to the wrist is a critical step in robot-assisted pulse diagnosis. Autonomous targeting, including position detection, tracking, and robotic control, remains a challenge. In this paper, a novel approach to accurate targeting using adaptive multimodal sensor fusion is proposed for robot-assisted TCM pulse diagnosis in which the robot approaches from a distance. While a single-modality sensor cannot adapt to scene changes during a dynamic approach, multimodal interfaces take advantage of complementary information and allow multiple sensory integration strategies. Furthermore, a diagnostic robot interacts closely with humans, so precise identification of human factors is required for decision making and can be achieved by physiological sensing. In this work, imaging photoplethysmography (iPPG) is used to provide cardiovascular features of the examined subjects, especially dynamic blood perfusion imaging of the wrist area [9], [10]. iPPG features are extracted from sequential visual images recorded by a camera mounted on the robotic hand used for palpation.
For accurate targeting in TCM pulse diagnosis, an adaptive fusion scheme was applied to integrate multiple imaging sensors and to guide the robot as it interacts with the subject and approaches the desired palpation location. A robust multimodal deep learning architecture was devised. A weighting factor was introduced to enable adaptability and to scale the contribution of the different modalities. Based on the weighted-summation fusion hypothesis, the contributions of the modalities can be assessed accordingly by a coherence weight. Experimental results show that the proposed sensor fusion method offers additional architectural flexibility and achieves adaptive performance in palpation localization during the robotic approach.

Experimental setups
A camera placed vertically above the wrist area was used to collect the PPG signal and wrist images of the volunteers' forearms. The distance to the wrist was set to approximately 5, 15, 20, 25, 30, 35 and 40 cm. The resolution of each image is 1000×1000 pixels. At each distance, 20 hand postures were recorded for each volunteer. Each posture was recorded for 5 seconds, yielding 200 images. For each recording, the TCM pulse location was assessed by the experimenter and labeled by its pixel location in the recorded image sequence.

Methodologies for PPG image processing
Extraction of iPPG information uses a two-stage strategy. In the first stage, a reference PPG is extracted using a large-area mean. In the second stage, this reference is used to extract the component with similar dynamics, followed by comparison over an averaged single-period time window.
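As an illustration of the first stage, a large-area mean can be sketched as a spatial average of each color channel in every frame. The function name `area_mean_signals`, the `roi` argument, and the (T, H, W, 3) array layout are assumptions made for this sketch, not the implementation used in the experiments.

```python
import numpy as np

def area_mean_signals(frames, roi=None):
    """Average each color channel over a region for every frame.

    frames: array of shape (T, H, W, 3), one RGB image per time step.
    roi:    optional (top, bottom, left, right) bounds; None uses the whole frame.
    Returns an array of shape (T, 3) holding one color signal per channel.
    """
    frames = np.asarray(frames, dtype=float)
    if roi is not None:
        top, bottom, left, right = roi
        frames = frames[:, top:bottom, left:right, :]
    # Spatial mean over height and width leaves one value per frame and channel.
    return frames.mean(axis=(1, 2))
```

Averaging over a large area suppresses pixel noise, which is why the result can serve as a reference for the per-pixel processing in the later steps.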

Step 1: Pre-processing of PPG signal
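Prior to PPG extraction, we removed the effect of the light source brightness and spectrum by mean-centralization of each color signal $c_i(t)$, $i \in \{R, G, B\}$, expressed by [11]:

$$\hat{c}_i(t) = \frac{c_i(t) - \mu(c_i, t, M)}{\mu(c_i, t, M)} \qquad (1)$$

where $\mu(c_i, t, M)$ is an M-point running mean of the color signal $c_i$:

$$\mu(c_i, t, M) = \frac{1}{M} \sum_{k=t-M+1}^{t} c_i(k) \qquad (2)$$

For $t < M$ we use $\mu(c_i, t, t)$. We followed [11] in taking M corresponding to 2 s, to cover at least a single period of the human heart rate.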
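The pre-processing step can be sketched as follows. Subtracting the M-point running mean and then dividing by it is the normalization assumed here, and the function name `normalize_color_signal` is hypothetical. If the 200 images per 5 s recording correspond to 40 frames per second, a 2 s window gives M = 80 samples.

```python
import numpy as np

def normalize_color_signal(c, M):
    """Mean-center and scale a color signal with an M-point running mean.

    For the first M - 1 samples the mean is taken over the samples seen so far.
    """
    c = np.asarray(c, dtype=float)
    csum = np.cumsum(c)
    t = np.arange(len(c))
    # Window length grows from 1 to M, then stays at M.
    window = np.minimum(t + 1, M)
    # Sum of the samples that have dropped out of the window (zero at the start).
    dropped = np.where(t >= M, csum[t - M], 0.0)
    mu = (csum - dropped) / window
    return (c - mu) / mu

# Example: 5 s of a synthetic green channel at an assumed 40 frames per second.
fs = 40
time = np.arange(5 * fs) / fs
green = 120.0 + 0.5 * np.sin(2 * np.pi * 1.2 * time)
green_norm = normalize_color_signal(green, M=2 * fs)
```

The normalized signal is dimensionless, so the same pulse produces the same relative variation regardless of how bright the illumination is.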

Step 2: CHROM method for PPG extraction
Generally, the "green-red difference" (G-R) method is used for PPG extraction [paper, 39]. CHROM is one of the commonly used G-R methods and computes the PPG as a combination of color signals [11], [12].
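A minimal sketch of a CHROM-style combination is given below. The coefficients follow the commonly published chrominance formulation (X = 3R - 2G, Y = 1.5R + G - 1.5B, S = X - αY with α the ratio of standard deviations), which is assumed here rather than taken from this paper. The band-pass filtering that usually precedes the final combination is omitted, and the function name `chrom_ppg` is hypothetical.

```python
import numpy as np

def chrom_ppg(r, g, b):
    """Combine normalized color signals into a chrominance-based PPG estimate."""
    r, g, b = (np.asarray(v, dtype=float) for v in (r, g, b))
    x = 3.0 * r - 2.0 * g             # first chrominance signal
    y = 1.5 * r + g - 1.5 * b         # second chrominance signal
    alpha = np.std(x) / np.std(y)     # balances the two chrominance signals
    return x - alpha * y

# Example: a pulse that modulates the three channels with different strengths.
t = np.arange(200) / 40.0
pulse = np.sin(2 * np.pi * 1.2 * t)
ppg = chrom_ppg(0.3 * pulse, 0.8 * pulse, 0.5 * pulse)
```

A useful property of this combination is that a disturbance appearing identically in all three channels, such as a change in overall brightness, cancels out, while the pulse, which affects the channels unequally, remains.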

Step 3: Signal processing for PPG reference
To remove signal artifacts that corrupt the raw PPG signal without altering its amplitude information, we combined the continuous wavelet transform (CWT), local filtering with a set of adaptive Gaussian windows, and the inverse CWT. The non-stationary PPG signal $x(t)$ was convolved with a daughter wavelet $\psi_{s,\tau}$, a scaled and shifted version of a mother wavelet $\psi$ [13], [14]:

$$W(s, \tau) = \int_{-\infty}^{\infty} x(t)\, \psi^{*}_{s,\tau}(t)\, dt, \qquad \psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\, \psi\!\left(\frac{t - \tau}{s}\right)$$

where $\psi_{s,\tau}$ is the daughter wavelet, scaled by $s$ and translated by $\tau$, and $\psi$ is the reference mother wavelet. To remove non-relevant coefficients that belong to the cardiac frequency band, a Gaussian window was centered on the maximum-energy location along the scale axis. We then computed the inverse CWT of the coefficients after applying the Gaussian filter.
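The wavelet filtering can be sketched in three parts: a forward CWT, a Gaussian window over scales centered on the maximum-energy scale at each time step, and an approximate inverse. The Morlet mother wavelet, the FFT-based transform, the reconstruction constant 0.776, and all parameter defaults below follow the widely used Torrence and Compo formulation and are assumptions, since the text does not specify the wavelet or the window width. The function names are hypothetical.

```python
import numpy as np

W0 = 6.0          # Morlet centre frequency (assumed)
C_DELTA = 0.776   # Morlet reconstruction factor for W0 = 6 (Torrence and Compo)

def morlet_cwt(x, scales, dt):
    """FFT-based continuous wavelet transform with a Morlet mother wavelet."""
    n = len(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)
    x_hat = np.fft.fft(x)
    coefs = np.empty((len(scales), n), dtype=complex)
    for j, s in enumerate(scales):
        # Fourier transform of the daughter wavelet at scale s (analytic wavelet).
        psi_hat = ((np.pi ** -0.25) * np.sqrt(2.0 * np.pi * s / dt)
                   * np.exp(-0.5 * (s * omega - W0) ** 2) * (omega > 0))
        coefs[j] = np.fft.ifft(x_hat * psi_hat)
    return coefs

def wavelet_filter(x, dt, s0=0.1, dj=0.1, n_scales=50, sigma=3.0):
    """Keep the coefficients near the dominant scale, then reconstruct."""
    x = np.asarray(x, dtype=float)
    scales = s0 * 2.0 ** (dj * np.arange(n_scales))
    coefs = morlet_cwt(x - x.mean(), scales, dt)
    # Scale index of maximum energy at every time step.
    peak = np.argmax(np.abs(coefs) ** 2, axis=0)
    j = np.arange(n_scales)[:, None]
    # One Gaussian window per time step, sigma measured in scale indices.
    window = np.exp(-0.5 * ((j - peak[None, :]) / sigma) ** 2)
    filtered = coefs * window
    # Approximate inverse: weighted sum of the real parts over scales.
    norm = dj * np.sqrt(dt) / (C_DELTA * np.pi ** -0.25)
    return norm * np.sum(filtered.real / np.sqrt(scales)[:, None], axis=0)

# Example: a noisy 1.2 Hz pulse sampled at an assumed 40 frames per second.
fs = 40.0
t = np.arange(400) / fs
clean = np.sin(2 * np.pi * 1.2 * t)
noisy = clean + 0.5 * np.random.default_rng(0).standard_normal(t.size)
denoised = wavelet_filter(noisy, 1.0 / fs)
```

The window width `sigma` trades noise rejection against amplitude: a narrow window removes more noise but also attenuates the pulse, so in practice it would need tuning to keep the amplitude information intact.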

Step 4: Principal component analysis (PCA) of single pixels
We applied PCA to calculate the PPG signal of each pixel by combining the reference PPG with the RGB PPGs of that pixel. The first principal component (the one associated with the largest eigenvalue) was taken as the final PPG signal at the target pixel.
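The per-pixel PCA step can be sketched as below. Stacking the reference PPG and the three color signals as four columns, centering without rescaling, and aligning the sign of the component with the reference are assumptions of this sketch. The function name `first_principal_component` is hypothetical.

```python
import numpy as np

def first_principal_component(ref, rgb):
    """Project [reference, R, G, B] signals of one pixel onto their first PC.

    ref: array of shape (T,), the reference PPG.
    rgb: array of shape (T, 3), the color signals of the pixel.
    Returns an array of shape (T,), sign-aligned with the reference.
    """
    data = np.column_stack([ref, rgb]).astype(float)
    data = data - data.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(data, full_matrices=False)
    pc1 = data @ vt[0]
    # The sign of a principal component is arbitrary; align it with the reference.
    if np.dot(pc1, data[:, 0]) < 0:
        pc1 = -pc1
    return pc1
```

Including the reference among the inputs biases the leading component toward the pulse dynamics shared by the reference and the pixel, rather than toward pixel-specific noise.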

Step 5: Weighted amplification of PPG signal
We computed a periodic average of the pulse cycles for both the PCA data and the reference PPG to obtain one-cycle PPG signals. The weighted amplitude Amp of the pixel at position (x, y) was then obtained by multiplying the averaged one-cycle PPG of that pixel by the averaged periodic signal of the reference PPG over the N points of the one-cycle PPG.
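A minimal sketch of this step is shown below. It assumes a fixed cycle length of N samples, cycles aligned with the start of the recording, and a plain sum of the point-wise products over one cycle. The exact normalization is not recoverable from the text, and the function names are hypothetical.

```python
import numpy as np

def one_cycle_average(signal, n_points):
    """Fold a signal into consecutive cycles of n_points samples and average them."""
    signal = np.asarray(signal, dtype=float)
    n_cycles = len(signal) // n_points
    cycles = signal[:n_cycles * n_points].reshape(n_cycles, n_points)
    return cycles.mean(axis=0)

def weighted_amplitude(pixel_ppg, ref_ppg, n_points):
    """Sum over one cycle of the pixel average times the reference average."""
    p = one_cycle_average(pixel_ppg, n_points)
    r = one_cycle_average(ref_ppg, n_points)
    return float(np.sum(p * r))

# Example with a four-sample cycle: an in-phase pixel and a quarter-cycle-shifted one.
ref = np.tile([0.0, 1.0, 0.0, -1.0], 5)
in_phase = weighted_amplitude(2.0 * ref, ref, 4)
shifted = weighted_amplitude(np.tile([1.0, 0.0, -1.0, 0.0], 5), ref, 4)
```

Under these assumptions a pixel whose pulse is in phase with the reference receives a large weighted amplitude, while a pixel whose signal is shifted by a quarter cycle receives none, which is what makes the map selective for perfused regions.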

Methodologies for adaptive fusion
To accurately target the TCM pulse location, we adopted multimodal fusion strategies such as the adaptive methodologies applied in [15], [16]. First, a convolutional network architecture was applied with one of the single modalities as input, that is, either the photo image of the hand or the iPPG. Images of each gesture at each distance can be pooled as samples to train one expert. Images of each iPPG result at a given distance can be pooled as samples to train a second expert. In this work, an adaptive fusion network architecture was developed (Fig. 2) by concatenating the single experts at the feature stage. The overall architecture extends the parallel combination of the individual experts so that the adaptive fusion model can be trained end to end. The fusion parameters adapt to the input patterns. The single-expert architecture used in this work is LeNet5. At the fully connected layer, the projection weights act as the gate, mapping the outputs of the experts to a probabilistic mixture.
Figure 2. The proposed fusion network architecture. This framework provides a parallel, concatenated mixture of experts. An individual expert architecture can be placed before the mixing layer, and the fully connected layer determines the contributions of input1 and input2.
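The role of the fully connected layer as a gate can be illustrated with a forward pass in NumPy. This is only a sketch: the feature size of 84 per expert (the width of the penultimate layer in the standard LeNet-5), the 25 output classes (one per cell of a 5×5 label grid), and the function names are assumptions, and training would be done end to end in a deep learning framework.

```python
import numpy as np

N_FEATURES = 84   # per-expert feature size, assumed from the standard LeNet-5
N_CLASSES = 25    # one class per cell of a 5x5 label grid (assumed)

def fused_prediction(feat_image, feat_ippg, weights, bias):
    """Concatenate expert features and map them to class probabilities."""
    z = weights @ np.concatenate([feat_image, feat_ippg]) + bias
    z = z - z.max()                  # stabilise the softmax
    p = np.exp(z)
    return p / p.sum()

def modality_logits(feat_image, feat_ippg, weights):
    """Split the fused logits into the parts contributed by each expert."""
    n = len(feat_image)
    return weights[:, :n] @ feat_image, weights[:, n:] @ feat_ippg

# Example with random features and weights standing in for trained values.
rng = np.random.default_rng(0)
f_img = rng.standard_normal(N_FEATURES)
f_ppg = rng.standard_normal(N_FEATURES)
W = 0.1 * rng.standard_normal((N_CLASSES, 2 * N_FEATURES))
b = np.zeros(N_CLASSES)
probs = fused_prediction(f_img, f_ppg, W, b)
z_img, z_ppg = modality_logits(f_img, f_ppg, W)
```

Because the layer is linear in the concatenated features, the fused logits are the sum of an image part and an iPPG part, so the learned weights directly scale how much each expert contributes.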

PPG imaging for detection artery distribution
The method for obtaining the iPPG signal is described in the experimental setup. Fig. 3(a) shows a typical imaging photoplethysmography (iPPG) signal of the left hand overlaid on the raw image taken by the high-resolution camera. Two areas with high PPG intensity are clearly visible, enclosed by blue and black lines, respectively.
It is well known that the capillaries in the palm and the radial artery of the wrist, which lie shallowly under the skin, are located at those positions. This physiological sensing technique shows that the iPPG signal can be used to determine the radial artery position at the wrist when the hand lies on the desk [17]. Since the intensity of the iPPG signal is large enough, we assumed that it can also be used to locate the radial artery pulse position when the wrist is rotated.
To verify this assumption, further experiments were conducted. Various hand gestures under the camera were devised, and the results with slight rotation of the wrist are shown in Fig. 3(b-e). The radial artery at the wrist and the capillaries in the palm can be clearly observed in the iPPG signal.
Figure 3. Typical imaging photoplethysmography of the left hand. Images were obtained at a vertical distance of approximately 15 cm between the left hand and the camera. The raw image taken by the camera is overlaid with the calculated imaging PPG; PPG intensity is shown by color, ranging from 20 (blue) to 150 (yellow) (a.u.). The area within the blue broken line denotes the radial artery at the wrist, and the area within the black broken line shows the capillaries in the palm.

Adaptive sensor fusion for radial pulse localization
We applied the individual expert CNN models and the fusion expert CNN model to fit the TCM pulse location at the wrist. The labels are the pixel indices of a 5×5 down-sampled image, as shown in Fig. 4(a, c, e, g). Fig. 4(b, d, f, h) shows samples of the human forearm (original image on top) along with their iPPG features (iPPG results at the bottom). As expected, the fusion model outperforms the single experts (Table 1). For the fusion model, we evaluated the contribution of each input modality by calculating the coherence between the fusion model's result and that of each single input. Fig. 5 shows that if the camera is distant, e.g. 40 cm above the wrist, the localization output from the original image is more coherent with the labels, so the model depends more on the original image. Conversely, if the camera is close, e.g. 5 cm above the wrist, the iPPG output is more coherent with the labels.
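One possible way to quantify such a coherence is sketched below. The text does not define the measure, so the agreement rate between the fused prediction and each single-expert prediction, normalized so the two values sum to one, is purely an assumption for illustration. The function name `coherence_weights` is hypothetical.

```python
import numpy as np

def coherence_weights(fused_pred, image_pred, ippg_pred):
    """Relative agreement of the fused output with each single-modality output.

    Each argument is a sequence of predicted grid-cell indices, one per sample.
    Returns (image_weight, ippg_weight), which sum to 1.
    """
    fused = np.asarray(fused_pred)
    agree_image = float(np.mean(fused == np.asarray(image_pred)))
    agree_ippg = float(np.mean(fused == np.asarray(ippg_pred)))
    total = agree_image + agree_ippg
    if total == 0.0:
        return 0.5, 0.5          # no agreement with either expert
    return agree_image / total, agree_ippg / total

# Example: the fused output follows the image expert on three of four samples.
w_image, w_ippg = coherence_weights([1, 2, 3, 4], [1, 2, 3, 0], [1, 0, 0, 0])
```

Computing these values separately for the recordings at each camera distance would give a curve per modality, which is the kind of comparison described for the distant and close cases.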

Conclusions
In this work, we have shown that iPPG, which measures the physiological changes of blood flow in the artery, is capable of localizing the TCM radial pulse for diagnosis, especially when the camera is close to the wrist. An adaptive fusion network has been developed that combines static image recordings with iPPG images.
Experimental studies demonstrate that the proposed method can precisely locate the desired palpation position along different approach paths by automatically adapting the coherence weights of the two modalities.
A comparison of the performance of each modality reveals how the network adapts to changes in the distance between the camera and the target. The proposed method provides a new pathway for dynamic targeting and tracking of the radial pulse as the sensor approaches the wrist. Integration of the new method into the TCM robot will be studied in future work.

Figure 1. (a) Conventional TCM pulse diagnostic method. (b) One type of pulse diagnosis apparatus. (c) Robot-assisted method designed by our group.

Figure 4. Model predictions are consistent with experimental recordings at each distance. The results were taken at 40 cm (b), 30 cm (d), 20 cm (f) and 15 cm (h). The left subplots (a, c, e, g) indicate the size of the palpation sites and the corresponding labels used in the neural network models.

Figure 5. The coherence of the fusion model outputs.

Table 1. Model comparison of single experts and the fusion model.