Development of a pointing device using surface electromyograms generated by mouth movements

Abstract A pointing device controlled by mouth movements was developed to accomplish hands-free operation of personal computers (PCs) or tablet terminals. The mouth movements were measured with surface electromyograms (EMGs) acquired with four pairs of active electrodes attached to the left and right zygomatic and depressor anguli oris muscles. The time-varying average rectified value (ARV) was obtained from the detected EMG signals. The position of the mouse pointer was controlled by vector composition of the normalized ARVs. The distance and angle errors were measured to evaluate the operation accuracy of the pointing device system. The experimental results demonstrated that the subjects could control the mouse pointer with the proposed method at any distance in eight different directions.


Introduction
Assistive technologies that provide hands-free access to information equipment are highly desired for people with severe physical disabilities. Moreover, the development of new input devices is expected to replace the mouse and keyboard accessories of mobile personal computers (PCs) or tablet terminals. To achieve this, assistive devices that use biomedical signals constitute one of the available methods for communication. Typical examples include a brain-computer interface (BCI) that uses electroencephalograms, an eye-gaze input device that uses an electrooculogram (EOG), and an interface that uses an electromyogram (EMG). These methods are still being researched and developed by various researchers and institutions. In the case of BCIs, measurement preparation and training require considerable time, and the operational accuracy is lower than that of other systems (Kennedy et al., 2000; Wolpaw et al., 2000). In the case of EOGs, the burden on the eyes is considerable, and prolonged use is difficult (Hori et al., 2006). Several interfaces have been proposed and investigated using surface EMGs. The operational performance of assistive devices that use EMGs is relatively high, and they can achieve complicated operations. Additionally, interfaces using arm muscle activity have been proposed for cases in which the upper limbs can be moved (Fukuda et al., 2003; Hussain, Spagnoletti, Salvietti, Prattichizzo et al., 2016a; Itou et al., 2001; Pan et al., 2019). However, people with severe disabilities involving paralysis and extensive impairments in motor function, such as spinal cord injuries and amyotrophic lateral sclerosis, may retain voluntary movement only around the head.
Methods have been proposed for detecting a) the face direction with electrodes attached around the neck (Moon et al., 2003) and b) the movement of the tongue with surface EMG electrodes attached around the chin (Sasaki et al., 2016; Zhang et al., 2014). Furthermore, methods have been proposed for speech recognition with surface EMGs (Cler, Nieto-Castanon et al., 2014; Morse et al., 1991) and for the characterization of facial gestures (Hamedi et al., 2013), swallowing (Suzuki et al., 2020), and affective states (Boxtel, 2010). Stepwise movement methods have also been proposed as computer interfaces that realize upward, leftward, rightward, and click inputs (Barreto et al., 2000), or upward, downward, leftward, rightward, and click inputs (Chin et al., 2006; Cler, Michener et al., 2014). In these methods, the distance of the cursor movement is constant, and the direction is limited to three or four directions. A face-computer interface has also been developed to control a robotic arm using audio feedback (Zhu et al., 2021). This method required the user to move the eyebrows and eyes up and down and the mouth laterally.
Surface EMG that quantifies mouth movements is suitable for a pointing device because complicated voluntary movements can be measured easily and accurately. In this study, we aimed to develop a pointing device that can move the cursor to an arbitrary location without direction or distance limits. We propose a pointing device that uses only mouth movements and realizes a natural mapping in which the vertical and horizontal movements of the mouth control the vertical and horizontal movements of the pointer. Using four sets of active bipolar electrodes attached to the left and right zygomatic and depressor anguli oris muscles, the distances and directions of the pointer movements were determined from the composite vector generated by the four EMG signals. Moreover, we aimed to improve accuracy by providing more intuitive visual feedback. With this method, the pointer can be expected to reach any position on the screen.

Mouth movements and EMG
To determine the two-dimensional pointer movement directions, the components of motion along the upward, downward, rightward, and leftward directions are required. To obtain the vertical component, we focused on the zygomatic muscle, which contributes a component in the upward direction, and on the depressor anguli oris muscle, which contributes a component in the downward direction. Furthermore, to obtain the horizontal component, electrodes were arranged on both the left and right sides of the zygomatic and depressor anguli oris muscles.
For example, in the case of upward movements, the left and right zygomatic muscles were pulled in the upward direction. In the case of movements in the lower-right direction, the lower-right corner of the mouth was pulled down. In the case of movements to the left, forces were generated on the left zygomatic and depressor anguli oris muscles.

System configuration
The EMG signals were measured with dry-type active bipolar electrodes (Osaka Electronic Equipment Ltd, Personal-EMG), as shown in Figure 1. The dimensions and weight of each electrode are 10 mm × 10 mm × 4 mm and 2 g, respectively. The electrodes and wires were fixed using medical adhesive tape. Four bipolar active electrodes were arranged on the zygomatic muscle (right: Channel 1 (Ch. 1), left: Ch. 2) and the depressor anguli oris muscle (left: Ch. 3, right: Ch. 4), as shown in Figure 1(b). Each electrode's location was adjusted to maximize the amplitude of the detected EMG signal (Moritani & Muro, 1987). The two bipolar bar electrodes were arranged perpendicular to the muscle fiber direction. Since the electrodes were small and lightweight, the subjects were not encumbered by them. A metal-rod reference electrode was grasped in one hand. Figure 2 shows the block diagram of the EMG-based pointing device system. The detected signals were amplified and input to the PC following digitization at a sampling rate of 1000 Hz. Mouth movement features were extracted from the pre-processed signals. Finally, the position of the pointer was determined by the composite vector of the four-channel signals.
The EMG signals were bandpass filtered from 20 to 400 Hz. The time-varying average rectified value (ARV) was obtained for each EMG signal according to

ARV_i(t) = (1/T) ∫_{t−T}^{t} |s_i(τ)| dτ, (1)

where s_i(t) is the surface EMG signal, i is the channel number, and T is the analysis window length. As muscle tension increases, the ARV also increases. T was set to 500 ms with reference to an arm EMG interface (Hussain, Spagnoletti, Salvietti, Prattichizzo et al., 2016b). When T is set too short, control becomes difficult because of high sensitivity; when T is set too long, the delay causes a slow response. A questionnaire given to the subjects showed that T = 500 ms received a good evaluation among the values tested. For more accurate parameter setting, it will be necessary to determine the optimum parameters by evaluating the performance as a pointing device.
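The ARV computation above, a causal moving average of the rectified signal, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and the constant test signal are our own.

```python
import numpy as np

def average_rectified_value(s, fs=1000, T=0.5):
    """Time-varying ARV of a surface EMG signal.

    s  : 1-D array of EMG samples
    fs : sampling rate in Hz (the system digitized at 1000 Hz)
    T  : analysis window length in seconds (500 ms in the paper)
    """
    n = int(T * fs)                 # window length in samples
    rect = np.abs(s)                # full-wave rectification
    kernel = np.ones(n) / n         # moving-average kernel
    # Causal moving average: ARV(t) depends only on samples in [t-T, t].
    return np.convolve(rect, kernel, mode="full")[:len(s)]

# Toy check: a constant-amplitude signal has an ARV equal to its absolute level
# once the window is filled.
sig = np.ones(2000) * 0.3
arv = average_rectified_value(sig)
```

A real pipeline would apply the 20–400 Hz bandpass filter before this step; it is omitted here to keep the sketch focused on Equation (1).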
The amplitudes of the ARV differed across mouth movements and channels. To adjust the moving range of the pointer, the ARV was normalized by a maximum value determined for each channel in a preliminary experiment. To obtain the normalization values, preliminary mouth movement experiments were performed for each subject in advance. Eight mouth movements were performed at 3 s intervals (2 s of movement and 1 s of rest): upward, downward, leftward, rightward, upper-right, upper-left, lower-left, and lower-right movements. Figure 3 shows illustrations of the relevant mouth movements. Table 1 lists the relationships between the four-channel ARV signals and the eight mouth movements. For the ith channel, the maximum value max_t ARV_i(t) was obtained in each of three movements (red frames in Table 1). For example, in the case of Ch. 1 (upper-right direction), the ARV yielded large values for the upward, rightward, and upper-right movements, and the maximum value was obtained for each of these three movements. The minimum among these three maxima was associated with the upper-right movement; thus, minmax_1 was set to the maximum value associated with that movement,

minmax_i = min_m max_t ARV_i(t), (2)

where m ranges over the three movements associated with channel i. As a result, normalization was performed based on the following formula:

g_i(t) = ARV_i(t) / minmax_i. (3)
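The min-of-maxima normalization described above can be sketched as follows. The numerical maxima are made up for illustration, and the clipping to [0, 1] is our own addition to keep the normalized value bounded; neither comes from the original paper.

```python
import numpy as np

# Hypothetical per-movement ARV maxima for one channel (Ch. 1, upper right).
# In the paper, a large ARV appears in three movements per channel:
# upward, rightward, and upper-right for Ch. 1 (values here are invented).
movement_maxima = {"up": 0.82, "right": 0.74, "upper_right": 0.68}

# minmax_i: the minimum among the three per-movement maxima (Eq. (2))
minmax = min(movement_maxima.values())

def normalize(arv, minmax):
    """Normalized ARV g_i(t) = ARV_i(t) / minmax_i (Eq. (3)).

    Clipping to [0, 1] is an assumption added here so that movements
    stronger than the calibration never push the pointer out of range.
    """
    return np.clip(np.asarray(arv, dtype=float) / minmax, 0.0, 1.0)

g = normalize([0.34, 0.68, 0.90], minmax)
```

Using the minimum of the three maxima guarantees that every one of the three associated movements can drive the normalized value up to 1.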
By normalizing with Equation (3), the amplitude difference among channels and mouth movements can be reduced, and the cursor movement on the display can be adjusted.
To determine the pointing position, the normalized ARVs g_i(t) (i = 1, 2, 3, 4) were assigned to the upper-right, upper-left, lower-left, and lower-right diagonal vectors, respectively (Figure 4). The x and y coordinates of the pointer were determined from the composition of the four vectors as

P_x(t) = L {g_1(t) − g_2(t) − g_3(t) + g_4(t)} + I_x, (4)
P_y(t) = L {g_1(t) + g_2(t) − g_3(t) − g_4(t)} + I_y, (5)

where L is a coefficient that adjusts the moving distance according to the display size, and (I_x, I_y) is the initial position of the pointer. The state of the pointer was controlled based on the moving distance per unit time, d [pixel/s]. When d was larger than a predetermined threshold, the pointer was moved according to the mouth movements; otherwise, the pointer was assumed to be clicked or stopped. The proposed system offers two types of pointer control. In the first, the pointer always starts at the center of the display, that is, (I_x, I_y) = (0, 0) in Equations (4) and (5). The pointer is driven by the contraction of the mouth muscles and returns to the center of the display when the mouth muscles are relaxed. In this mode, the subject cannot hold the pointer at an arbitrary position, but the mode is effective for quick control in the near field. In the second, when the mouth muscles are relaxed, the pointer stops at the point (I_x, I_y) in Equations (4) and (5) and can move onward from this location. This mode is suitable for precise pointing in the far field.
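The diagonal vector composition and the move/stop decision can be sketched as follows. The sign pattern follows the channel-to-diagonal assignment described above (Ch. 1 upper right, Ch. 2 upper left, Ch. 3 lower left, Ch. 4 lower right); the speed threshold value is illustrative, not the paper's.

```python
import math

def pointer_position(g, L=100, init=(0, 0)):
    """Pointer coordinates from normalized ARVs g = (g1, g2, g3, g4).

    Channels map to diagonals: g1 upper-right, g2 upper-left,
    g3 lower-left, g4 lower-right. L is the distance coefficient
    in pixels (100 in the experiments); init is (I_x, I_y).
    """
    g1, g2, g3, g4 = g
    x = L * (g1 - g2 - g3 + g4) + init[0]
    y = L * (g1 + g2 - g3 - g4) + init[1]
    return x, y

def is_moving(prev, cur, dt, threshold=50.0):
    """Moving-distance-per-unit-time test: below the threshold the
    pointer is treated as stopped (or as a click candidate).
    The 50 pixel/s threshold is an assumed example value."""
    d = math.hypot(cur[0] - prev[0], cur[1] - prev[1]) / dt
    return d > threshold
```

For example, equal activation of Ch. 1 and Ch. 4 (the two right-side diagonals) cancels vertically and yields a purely rightward displacement, matching the intended natural mapping.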

Performance evaluation
The distance and angle errors were measured to evaluate the operation accuracy of the pointing device system. The subject controlled the pointer to approach the presented target using mouth movements.
The distance error R_e and angle error θ_e were calculated as

R_e = |R_t − R_p|, (6)
θ_e = |θ_t − θ_p|, (7)

where R_t is the distance between the origin and the target, R_p = √(P_x(t)² + P_y(t)²) is the distance between the origin and the pointer, θ_t is the angle of the target, and θ_p = tan⁻¹(P_y(t)/P_x(t)) is the angle of the pointer. The distance and angle errors were measured 2 to 4 s after the target was indicated. The distance between the display and the subject was 50 cm. The size of the display was 15.6 inches, and its resolution was 1920 pixels × 1080 pixels. The resolution of the operation window was 400 pixels × 400 pixels, and the actual length of 100 pixels was 2.3 cm. The parameter L in Equations (4) and (5) was set to 100 pixels. The eight mouth movements used in the preliminary input paradigms were performed. Figure 5 shows an example of the normalized ARV signals from the four channels when the subjects moved their mouths in each direction. The subjects repeatedly moved their mouths in the indicated direction for 2 s and then returned to the normal state within 1 s. Motions were executed in the upward, downward, leftward, rightward, upper-right, upper-left, lower-left, and lower-right directions. The patterns of the normalized signals matched those shown in Table 1. Figure 6 shows the pointer trajectory without visual feedback. The subject attempted to move the pointer in the horizontal and vertical directions (Figure 6(a)) and the diagonal directions (Figure 6(b)). Five trials were performed in each direction. The pointer could be moved continually in eight directions. However, without feedback, there was a rightward bias when the pointer moved downward, and a variation was noted when it moved in the lower-left direction. Figure 7 shows the pointer trajectory with visual feedback, in which the current position of the pointer was shown to the subject. The bias and variation were reduced compared with Figure 6.
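The error metrics of Equations (6) and (7) can be computed as follows. This is an illustrative sketch; `atan2` is used in place of tan⁻¹(y/x) to handle all four quadrants, and the angle difference is wrapped to [0°, 180°], both reasonable implementation choices not spelled out in the paper.

```python
import math

def distance_and_angle_errors(pointer, target):
    """Distance error R_e and angle error theta_e (degrees).

    pointer, target: (x, y) in pixels relative to the origin.
    """
    px, py = pointer
    tx, ty = target
    r_p = math.hypot(px, py)                 # pointer distance R_p
    r_t = math.hypot(tx, ty)                 # target distance R_t
    th_p = math.degrees(math.atan2(py, px))  # atan2 resolves the quadrant
    th_t = math.degrees(math.atan2(ty, tx))  # ambiguity of tan^-1(y/x)
    r_e = abs(r_t - r_p)
    th_e = abs((th_t - th_p + 180) % 360 - 180)  # wrap into [0, 180]
    return r_e, th_e

# Example: pointer at (30, 40) (distance 50, angle ~53.13 deg),
# target at (60, 0) (distance 60, angle 0 deg).
r_e, th_e = distance_and_angle_errors((30, 40), (60, 0))
```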
The variation in the sideward directions appears larger than that in the other directions, such as the vertical and diagonal directions.

Experiments
Next, the performance of the proposed system was evaluated to examine its usefulness as a communication support device. We experimented by changing the distance and direction of the target from the origin. The distance was set to 20, 40, 60, 80, and 100 pixels. Eight angular directions in 45° increments were used, with the positive direction of the x-axis defined as 0°. The position of the target changed randomly every 5 s. Three sets of experiments were performed for each distance and each direction. The subject was instructed to fixate on the target and move the cursor closer to it. Figure 8 shows an example of the transient response of the (a) distance and (b) angle for the target. The distance and angle of the target were 60 pixels and 22.5°, respectively. The trajectory of the pointer is also shown in Figure 8(c). Based on the transient response of the cursor position, the distance and angle stabilized approximately 1 s after the target presentation.
The distance and angle errors were obtained for each distance and direction, with the analysis interval set to 2 to 4 s after target presentation. Figure 9(a) shows the relationship between the target distance and the pointer distance, and Figure 9(b) shows the distance error between the target and the pointer. These plots show the means and standard deviations of the five subjects. The distance error increased as the target moved farther from the origin. Figure 10(a) shows the direction error between the target and the pointer when the target distance was changed. The direction error was within ±10°, even accounting for the deviation, regardless of the target distance. Figure 10(b) shows the relationship between the target direction and the pointer direction, and Figure 10(c) shows the direction error between the target and the pointer. In this case, the direction error was within ±10°, even accounting for the deviation, regardless of the target direction.
Two-sample t-tests assuming equal variances were used to evaluate the distance and direction accuracy. From the results in Figure 9(a), a difference in distance of 20 pixels could be discriminated at a significance level of 1%. In addition, from the results in Figure 10(b), a difference in direction of 45° could be significantly discriminated. Since significant differences were observed in all pairs, the significance markers are omitted from the figures.
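An equal-variance two-sample t-statistic of the kind used above can be computed with the standard library as follows. The function and the tiny example data are our own illustration; in practice the p-value would be obtained from the t-distribution with the returned degrees of freedom (e.g. via scipy.stats).

```python
import math
from statistics import mean, variance

def two_sample_t(a, b):
    """Student's two-sample t-test statistic assuming equal variances.

    Returns (t, df), where df = n_a + n_b - 2. The pooled variance
    combines the two sample variances weighted by their degrees of
    freedom, which is the 'equal variances' assumption.
    """
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical pointer-distance samples for two targets 2 units apart:
t_stat, df = two_sample_t([1, 2, 3], [3, 4, 5])
```

A large |t| relative to the critical value for `df` degrees of freedom at the 1% level indicates that the two target distances are discriminable.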

Discussion
Usability is an important factor in human-computer interfaces. Interfaces that use multiple parts of the face, such as the eyebrows, eyes, and mouth, allow various input operations (Zhu et al., 2021). However, since the user must pay attention to several parts of the face, the operation becomes complicated. In this study, we developed an interface that uses only the mouth. By making the direction and strength of the mouth movements correspond to the direction and moving distance of the pointer, more natural input becomes possible. In addition, visually feeding back the position of the pointer enables more intuitive control. In the future, we plan to perform a more detailed performance evaluation from the viewpoint of usability.
We consider the ARVs calculated from the four-channel EMG signals shown in Figure 4. For example, in upward mouth movements, a large-amplitude potential was observed at the electrodes of Ch. 1 and Ch. 2, placed above the mouth, while a smaller-amplitude potential was observed at the electrodes of Ch. 3 and Ch. 4, placed below the mouth. That is, upward movements of the mouth can be represented by combining the signals of Ch. 1, Ch. 2, Ch. 3, and Ch. 4. For the other directions, including the vertical, horizontal, and diagonal components, the directions of the mouth movements coincided with the directions of the electrodes at which large potentials were observed. It was thus possible to operate in any direction by combining the four components.
In the present method, the four-channel ARV signals were normalized to make their amplitudes uniform. Without normalization, a higher potential was confirmed for Ch. 1 during the upper-right operation than for Ch. 2 during the upper-left operation. Additionally, in the case of the rightward mouth movement, the potential of Ch. 1 was higher than that of Ch. 4. Thus, there were amplitude differences in the measured ARV signals between channels and between operations. If the raw ARV amplitudes were combined in each direction, the moving range of the pointer would be biased. Therefore, normalization was performed to reduce the differences in amplitude and adjust the moving range of the pointer. There are multiple movements for which a large potential is observed in a given channel; after identifying the maximum amplitude for each movement, the minimum among these maxima was identified to facilitate normalization.
In the cases in which there was no visual feedback, the deviations and variations of the pointer trajectories were relatively large, as shown in Figure 6. Conversely, the deviations and variations could be suppressed when visual feedback was provided, as shown in Figure 7. It is considered that the pointer could be controlled more accurately by adjusting the EMG of the mouth movements according to the feedback. However, even when feedback was provided, there were cases where variations were observed in the rightward direction in Figures 7(a) and 7(b). Hereinafter, the accuracy was evaluated with feedback provided to the subject. There are variations in the sideward directions, as shown in Figures 6 and 7. Diagonal movement can be achieved using a single ARV signal, and vertical movement can be achieved by a bilaterally symmetrical movement of the mouth; thus, these directions are relatively easy to control. In contrast, in the sideward direction, it is necessary to move the mouth horizontally while balancing the upper zygomatic muscles against the lower depressor anguli oris muscles. This is considered the reason why the variation became large when moving sideward. However, the experimental results of Figure 10(c) showed no difference in the error depending on the direction, so this variation was considered acceptable.
Based on the experimental results in Figures 9 and 10(a), we determined the accuracies of the distance and direction in controlling the pointer. The distance error increased as the target moved farther from the origin, as shown in Figure 9(a, b). This is probably because it is difficult to keep muscle tension high. Compensating the pointer position according to the strength of the ARV signals will be required. Conversely, the direction error was almost constant regardless of the target distance, as shown in Figure 10(a). This is probably because direction is easy to control by combining multiple muscles using the feedback of the pointer position.
As shown in Figure 10(a, c), the direction error, including the standard deviation, was within ±10°, and there were no significant differences among directions. It is thought that the pointer can be controlled in eight directions at 45° increments. However, in some subjects, the angular error was larger than 10°, depending on the direction. This may be attributed to the subjects' unsatisfactory mouth movements. Therefore, it is necessary to develop a training system for mouth movements. In addition, it is necessary to improve the operation accuracy by tuning the signal processing and the determination of the pointer coordinates based on each subject's preliminary input.
In the proposed pointing system, the pointing operation was performed around the origin. The pointer moved owing to the myoelectric activity generated by the movement of the mouth, and it returned to the origin when the mouth force was released. With this system, it was possible to click on an arbitrary target by keeping the muscle activity constant for a certain time once the pointer reached the target. Based on the experimental results of the transient responses, it was considered appropriate to maintain the pointer's position for approximately 2 s. Alternatively, it is possible to maintain the pointer's position without returning it to the origin even when the force is released; this position can then be used as the origin for the next operation. With this method, it is possible to select a target by performing multiple operations.
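The dwell-based click described above can be sketched as a small state machine: a click fires when the pointer speed stays below a stillness threshold for the hold time. This is our own illustrative reconstruction; the 2 s hold time follows the paper, while the speed threshold and update interval are assumed values.

```python
import math

class DwellClickController:
    """Fire a click when the pointer stays nearly still for a hold time."""

    def __init__(self, hold_time=2.0, speed_threshold=50.0):
        self.hold_time = hold_time              # seconds of stillness for a click
        self.speed_threshold = speed_threshold  # pixels/s (assumed value)
        self.still_for = 0.0                    # accumulated stillness time
        self.prev = None                        # previous pointer position

    def update(self, pos, dt):
        """Feed the pointer position once per frame; returns True on a click."""
        if self.prev is not None:
            speed = math.hypot(pos[0] - self.prev[0],
                               pos[1] - self.prev[1]) / dt
            # Accumulate stillness time; any large movement resets the timer.
            self.still_for = self.still_for + dt if speed < self.speed_threshold else 0.0
        self.prev = pos
        if self.still_for >= self.hold_time:
            self.still_for = 0.0   # reset so the next click needs a fresh dwell
            return True
        return False
```

Feeding a stationary position at 0.5 s intervals, the controller fires once after the cumulative stillness reaches 2 s and then starts a fresh dwell.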

Conclusion
A pointing device that can be operated by mouth movements was developed using EMGs measured with four pairs of surface electrodes attached to the left and right zygomatic and depressor anguli oris muscles. The proposed device achieved pointer operation corresponding only to the movements of the mouth while providing visual feedback. The experiments showed that the developed system could control pointing in eight directions with high usability.
Future tasks include the improvement of the pointer operation accuracy and operability.