1 Introduction

There are various force-input devices, and force-input operation has gradually become common with the adoption of touch-input panels. Recent research results include augmented feedback [1, 2] and force-input interaction on foam objects [3, 4], and the field of force-input interaction continues to develop. However, force control remains more difficult to learn than position or velocity control, because force control, unlike position/velocity control, relies mainly on somatic feedback with only slight visual/auditory feedback.

It is generally understood that augmented feedback helps users to operate devices with high accuracy and/or efficiency; for example, visual feedback helps text input. Many studies use augmented feedback to enhance manipulation capability, but an unwanted side effect is that users come to rely too heavily on the augmented feedback, especially augmented visual feedback [5]. A familiar example is touch typing: in initial training, visual feedback (looking at the keyboard) helps users to develop typing skill, but it then hinders the acquisition of true touch-typing skill. On the other hand, study [5] showed that learning based on auditory feedback allows users to perform equally well with or without feedback. Recent work has examined the effectiveness of certain kinds of multimodal augmented feedback [6], and this work focuses on the effect of augmented visual-auditory feedback in learning how to control force.

The aim of this paper is to introduce a method that supports users when initially learning how to use force-input devices. We examine what kind of feedback best enhances the learning of force-input interaction in terms of motor skills. In this paper, we focus on the acquisition of force-control skill, not on operation itself, and investigate the learning effect of augmented feedback. Once users learn to operate a device without feedback, they can operate it more rapidly and with reduced burden, as in eyes-free operation. For force-input operation, learning to operate devices without feedback means acquiring anticipatory force control. Because long practice times burden users, rapid learning is preferable; our case study therefore examines whether short-term visual-auditory feedback training can promote the acquisition of highly accurate force control without feedback.

We conduct an experiment in which participants applied pushing force (using their thumb) to a foam cube under different augmented feedback conditions: (1) visual feedback, (2) auditory feedback and (3) visual-auditory feedback; the feedback provides an experience similar to playing a musical instrument. The experiment uses eight levels of pushing force, and we examine how accurately participants could produce each level under each feedback condition. We tested whether the performance enhancement achieved with feedback remained in the absence of the feedback. The results confirmed that augmented visual-auditory feedback was more effective for learning pushing force control than augmented visual or auditory feedback in isolation.

2 Related Work

2.1 Force-Input Interaction

There are various force-input devices, and several studies have examined force-input interaction. Force-input interaction can be used as an alternative to touch input and as a supplement to enhance existing input interaction. Several different approaches to augmented feedback have been investigated. Study [7] conducted target selection experiments in which a stylus' pressure was varied to examine which levels of pressure users could easily discriminate with visual feedback. Study [2] introduced force input that gives the feeling of being pressed, and study [8] showed how to enrich the representation methods. However, these feedback studies focused on improving input operation. Our intention is to achieve a situation in which users can exert force control without augmented feedback, and so we focus on the effectiveness of feedback-based learning.

Recently, force-input interaction on foam objects has been researched, and methods for interacting through foam objects attached to smartphones [3] or cushions [4] have been developed. This paper focuses on foam objects. Foam objects allow users to better feel finger resistance and help users to learn force control. In fact, they are also used in rehabilitation of force control.

2.2 Motor Learning and Augmented Feedback

Motor skills have aspects of both feedback control and feed-forward control. In feedback control, real-time feedback is used to control the present motion. In feed-forward control, an initial command is sent to the muscle (based on internal proprioceptive information) to effect the push, and feedback information is used to raise the accuracy of the next motion, not the present one. Rapid motion control and anticipatory motion control are components of feed-forward control. Effective anticipatory motion control reduces the user's burden, and so we focus on feed-forward force control.

To improve the accuracy of feed-forward motion control, it is necessary to develop a better internal model, and inverse models are preferred. It is understood that good inverse models are constructed through feedback error learning [9], and that providing accurate sensory feedback information leads to the acquisition of highly accurate motion control. Many researchers have investigated the effects of augmented feedback, also known as extrinsic feedback, in motor learning [10]. In general, augmented feedback has been shown to be highly effective.

Study [5] focused on learning a bimanual wrist coordination pattern, and its results show that people who use augmented visual feedback become dependent on it; accuracy is high only if the feedback is provided, whereas the auditory feedback group performed equally well with or without feedback. Recent work has examined the effectiveness of certain kinds of multimodal augmented feedback [6]. On the other hand, real-time continuous feedback has been shown to degrade the learning of timing control [11, 12]. It is thought that some types of feedback create dependencies rather than skills. Thus, feedback design must consider both the timing and the kind of feedback.

2.3 Force Control and Augmented Feedback

In force control, given a target value, the trajectory reaches an extreme value (called the peak value hereafter) and thereafter approaches the target value. It is thought that the peak value reflects the degree of anticipatory force control and that the convergence period reflects feedback force control. Regarding pushing force control, study [12] showed that continuous visual feedback degraded the accuracy of force control unless it was provided only after the peak value. Study [13] focused on the effects of auditory feedback on prolonged force control of manual and oral force. However, their experiments examined only one force level and only visual or auditory feedback.

We conducted an experiment to verify the effectiveness of multimodal feedback in learning pushing force control, especially anticipatory force control. In the experiment, participants were trained to produce eight levels of pushing force under three combinations of augmented feedback: (1) visual feedback, (2) auditory feedback and (3) visual-auditory feedback.

Fig. 1. Experimental layout and deformation level

3 Experiment

3.1 Participants

Twelve adults (5 females, 7 males) aged between 23 and 36 (mean = 27) years participated. The experiment was conducted for each hand under the different augmented feedback conditions (eight participants per feedback condition). The order of hands and the order of feedback conditions were counterbalanced.

3.2 Experimental System Design

Figure 1 is a schematic illustration of the experimental set-up. The experiment used a foam cube, the target of the pushing action, and a PC (MacBook Air 11 in.) that controlled the augmented visual and auditory feedback. The cube had sides of 20 cm, and a bending sensor (SFESEN-08606, 4.5 in.) was attached to the middle of one face of the cube (the target face). Participants were told to push the central point of the target face, which was marked by a sticker. The bending sensor was connected to an Arduino Uno, which was connected to the PC. When the target face of the cube was pushed, the Arduino Uno captured the change in the resistance value of the sensor and sent the result to the PC, which calculated the degree of surface deformation every 100 ms. The cube and the PC were placed on a desk with an occluder (60 cm × 40 cm) between them, so that the participants could not see their hand during the experiment. The distance between the middle of the cube and the participant's body was 50 cm. The PC was placed 40 cm in front of the participants.
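As an illustration of this sensing pipeline, the following is a minimal sketch of the PC-side acquisition loop, assuming the Arduino Uno simply streams one raw sensor reading per line over USB serial; the port name, baud rate, and the calibration inside deformation() are hypothetical, not the authors' values.

```python
import time
import serial  # pyserial

PORT = "/dev/tty.usbmodem1411"  # hypothetical serial port of the Arduino Uno
BAUD = 9600                     # hypothetical baud rate

def deformation(raw):
    """Convert a raw bending-sensor reading into a surface-deformation value.
    The offset and scale below are placeholders for an actual calibration."""
    return (raw - 512) / 512.0

with serial.Serial(PORT, BAUD, timeout=0.1) as ser:
    next_sample = time.time()
    while True:
        line = ser.readline().strip()
        if not line:
            continue
        raw = int(line)
        # The paper states that the PC computes the surface deformation every 100 ms.
        if time.time() >= next_sample:
            next_sample += 0.1
            print(deformation(raw))
```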

In handling or using the cube, the level of force varies with the softness of the surface, so we used the level of object deformation (captured by the bending sensor) instead of strict weight values. The lower right panel of Fig. 1 shows simplified changes in the level of surface deformation. These changes were grouped into eight levels (B4, C5, D5, E5, F5, G5, A5, B5), called deformation levels here. The value of the feedback change depends on the value of the bending-sensor deformation, which is not a linear indication of the deformation of the target surface. This is reasonable because we want to know whether users can learn new force levels, not a linear relation. The number of levels was set, based on a preliminary experiment with three participants (none of whom took part in the formal experiment), to be neither too difficult nor too easy, because we wanted to verify whether participants could properly discriminate the levels of pushing force.
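For illustration, the grouping into the eight deformation levels could be implemented as a simple threshold lookup; the boundary values below are hypothetical, since the paper reports neither the thresholds nor the (explicitly non-linear) sensor calibration.

```python
# Eight deformation levels from the paper; threshold values are hypothetical.
LEVELS = ["B4", "C5", "D5", "E5", "F5", "G5", "A5", "B5"]
THRESHOLDS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]  # 7 boundaries -> 8 levels

def deformation_level(value):
    """Return the deformation-level name for a bending-sensor deformation value."""
    for name, upper in zip(LEVELS, THRESHOLDS):
        if value < upper:
            return name
    return LEVELS[-1]

print(deformation_level(0.35))  # -> "E5" with these placeholder thresholds
```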

The PC changed the feedback in a step-like manner in response to the force produced by the participants. The augmented feedback was realized by a line and bar on a visual display and/or by simple tones, where bar height and tone indicated the deformation level (Figs. 1 and 2). Details are given below. (1) Visual feedback: The resolution of the PC display was 1366 × 768 pixels and the size of the visual display was 800 × 600 pixels. The target was presented as a black line and the present force level was presented as a green bar (10 × 50 pixels) on the display. (2) Auditory feedback: The target and the present force level were described by sound tones. The sound signal output, a single sine wave (frame rate = 60 Hz, portamento speed = 200 ms), was fed to the PC's speaker. B4-B5 were named after piano tones (329.63, 349.23, 392.00, 440.00, 493.88, 523.25, 587.33, 659.26 Hz). (3) Visual-auditory feedback: a mixture of (1) and (2).
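A minimal sketch of how the two feedback channels could be rendered, pairing the level names in order with the frequencies listed above; the sample rate, amplitude, and the sounddevice back end are assumptions, and the bar-height mapping is a placeholder rather than the authors' exact display code.

```python
import numpy as np
import sounddevice as sd  # assumed audio back end; any PCM output would do

# Level names paired in order with the frequencies listed in the paper (Hz).
TONES = {"B4": 329.63, "C5": 349.23, "D5": 392.00, "E5": 440.00,
         "F5": 493.88, "G5": 523.25, "A5": 587.33, "B5": 659.26}
FS = 44100  # sample rate (assumption)

def play_level(level, duration=0.2):
    """Play a short sine tone corresponding to the given deformation level."""
    t = np.arange(int(FS * duration)) / FS
    sd.play(0.3 * np.sin(2 * np.pi * TONES[level] * t), FS)

def bar_height(level, display_height=600):
    """Map a deformation level to the height of the green feedback bar (pixels)."""
    index = list(TONES).index(level)
    return int((index + 1) / len(TONES) * display_height)
```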

Fig. 2. (a) The blue line shows the trajectory of deformation over time at the 2nd training and the green line shows that at the 3rd test. The pink area shows the target level. When the participants controlled their force of thumb pushing appropriately, the system counted the event as a correct answer. (b) An enlargement of the area delineated by the dotted line in (a). (Color figure online)

3.3 Experimental Procedure

In this experiment, participants were instructed to push the target surface so as to reach the presented target deformation level as quickly as possible. During training, both the target deformation level and the present deformation level produced by the participant were presented to the participant. After training, participants took the retention test, in which only the target deformation level was given. In training, each level triggered one kind of augmented feedback. Participants were instructed to push the cube with the thumb of the specified hand. The experiment was conducted for each hand under different augmented feedback combinations. Each participant followed the order of 1st training, 1st test, 2nd training, 2nd test and 3rd test with each hand. Based on recent findings on augmented visual feedback [11, 12], augmented feedback was given after the peak value was reached. To raise the effectiveness of the training, participants were told that feedback would be given only after the pushing force became stable. At the end of the experiment, participants answered the questionnaire shown in Table 1, with scores ranging from 0 to 5 (0: I do not think so at all, 5: I really think so).

Table 1. Questionnaire items

Task. Each target deformation level corresponded to one tone of the melody of a well-known children's song, analogous to playing a musical instrument. This approach should make learning similar to playing a musical instrument, and thus motivate participants to perform the experiment and form an unconscious association between force and feedback, which is expected to yield more accurate force control. The target deformation levels were presented in the order of the melody, and participants were told to push the target surface with their thumb to reach the presented level. Training and test used five target deformation levels. From the 1st training to the 2nd test, the same song, "Tulip", and five target deformation levels (C5, D5, E5, G5, A5) were used; the total number of targets was 33. From the 1st training to the 2nd test, participants were expected to learn not only the presented five levels but also F5 (which appears in the 3rd test and corresponds to the gap between E5 and G5) as well as the level beyond A5 and the level below C5 (not touched), i.e., eight levels in total. Each target deformation level was presented for 3 s regardless of the rhythm of the melody, after which the system automatically switched to the next level. The time interval was decided from the results of a preliminary experiment. Figure 2(a) shows target deformation levels and an example of surface deformation in the 2nd training and the 2nd test. Figure 2(b) shows an enlarged area of Fig. 2(a). Each test and training session took about two minutes. Through the task, participants were trained to distinguish relative force changes, not absolute force levels. To determine whether participants could also learn absolute deformation levels, they took the 3rd test. In the 3rd test, a different song, "Bienchen Summ Herum!", and a different set of five target deformation levels (C5, D5, E5, F5, G5) were used, and the total number of targets was 32.
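The trial sequencing described above can be sketched as follows; the melody excerpt and the callbacks for presenting targets and sampling responses are hypothetical placeholders, while the 3-s presentation interval and the 100-ms sampling period come from the text.

```python
import time

# Hypothetical excerpt standing in for the 33-target "Tulip" sequence.
TULIP_EXCERPT = ["C5", "D5", "E5", "C5", "D5", "E5", "G5", "E5", "D5", "C5"]

def run_block(targets, present_target, sample_response, interval=3.0):
    """Present each target deformation level for `interval` seconds, then advance."""
    responses = []
    for level in targets:
        present_target(level)                  # show the line and/or play the tone
        samples = []
        t_end = time.time() + interval
        while time.time() < t_end:
            samples.append(sample_response())  # deformation value from the sensor
            time.sleep(0.1)                    # 100-ms sampling period (Sect. 3.2)
        responses.append((level, samples))
    return responses
```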

Fig. 3. The feedback modes: (1) visual feedback, (2) auditory feedback. Visual-auditory feedback is a combination of (1) and (2). This example shows the feedback for data points A and B in Fig. 2(b).

Training. Figure 3 shows how larger surface deformation raised the height of the visual feedback bar on the display and the tone of the auditory feedback signal during training. The target value and level produced by the participant were presented as follows. (1) Visual feedback: On the display, the target deformation level was presented as the target line and augmented feedback was shown as the height of the visual feedback bar, see Fig. 3(1). (2) Auditory feedback: From the speaker of the PC, the target deformation level was presented as a sound tone and augmented feedback was output as a sound tone, see Fig. 3(2). (3) Visual-auditory feedback: combination of (1) and (2).

Test. After training, a learning test was conducted to determine how well the participants had learned deformation level control. In this test, the target deformation level was presented in the same way as in training but no feedback was provided.

3.4 Determination of Accuracy

We used all data except the first 3-s segment of each song and the final 3-s segments in which the same tone continued. Given a trajectory of force strength over time, the first extreme value (peak value) reflects anticipatory force control and the subsequent convergence region reflects feedback control [12, 14] (Fig. 2(b)). We used these values in determining participant performance. We focused mainly on the peak value: if it was inside the target deformation level (a range), we counted the response as correct. Participants were informed of the method of determining accuracy in advance.
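A sketch of this scoring rule, assuming each response is the sampled deformation trajectory for one target; the first local extreme value is taken as the peak and checked against the target range, whose boundaries here are hypothetical.

```python
def first_peak(trajectory):
    """Return the first local extreme value (the peak) of a deformation trajectory."""
    for prev, cur, nxt in zip(trajectory, trajectory[1:], trajectory[2:]):
        if (cur >= prev and cur > nxt) or (cur <= prev and cur < nxt):
            return cur
    return trajectory[-1] if trajectory else None

def is_correct(trajectory, target_range):
    """A response is correct if the peak lies inside the target deformation range."""
    low, high = target_range
    peak = first_peak(trajectory)
    return peak is not None and low <= peak <= high

# The peak (0.46) overshoots the target centre but stays inside the band, so it counts.
print(is_correct([0.10, 0.30, 0.46, 0.43, 0.41], target_range=(0.40, 0.50)))  # True
```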

3.5 Results

Fig. 4. (i) The correct answer rate in each training and test. (ii) The correct answer rate in training and test.

Fig. 5. The absolute difference from the center value of the correct area in the 3rd test.

Figure 4(i) shows the correct answer rate in training session and test, and Fig. 4(ii) shows the correct answer rate in each training session and test. We conducted t-tests with Bonferroni correction on the training data; there was a significant difference between visual feedback and auditory feedback (p < 0.001) and a significant difference between auditory feedback and visual-auditory feedback (p < 0.001). In the 3rd test, ANOVA identified a marginally significant effect of feedback type (F(2,18) = 3.456, p = 0.0769), no significant effect of hand used (F(1,18) = 0.402, p = 0.5418), and no interaction between these effects (F(2,18) = 0.675, p = 0.5329). With Bonferroni correction, there was a significant difference between auditory feedback and visual-auditory feedback (p = 0.038) and no significant difference between visual feedback and visual-auditory feedback (p = 0.279).

For each feedback condition, we conducted t-tests with Bonferroni correction between the 2nd training and the 2nd and 3rd tests. For visual feedback, there was a significant difference between the 2nd training and the 2nd test (p = 0.019) and between the 2nd training and the 3rd test (p = 0.001). For auditory feedback, there was a significant difference between the 2nd training and the 3rd test (p = 0.012). For visual-auditory feedback, there was a significant difference between the 2nd training and the 3rd test (p = 0.011).

Figure 5 shows the absolute difference from the center value of the correct area in the 3rd test. ANOVA identified a significant effect of feedback type (F(2,381) = 19.906, p < 0.001), a significant effect of hand used (F(2,381) = 26.5, p < 0.001), and a significant interaction between these effects (F(2,381) = 12.819, p < 0.001). With Bonferroni correction, on the dominant hand there was a significant difference between auditory feedback and visual-auditory feedback (p = 0.012) and no significant difference between visual feedback and visual-auditory feedback (p < 0.001); on the non-dominant hand, there was a significant difference between auditory feedback and visual-auditory feedback (p < 0.001) and no significant difference between visual feedback and visual-auditory feedback (p < 0.001).
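The statistical analysis can be reproduced in outline with standard libraries; the sketch below assumes the scores are stored in a CSV with hypothetical columns 'score', 'feedback' and 'hand', and uses a simple between-groups approximation rather than the authors' exact design.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
from scipy import stats
from statsmodels.stats.multitest import multipletests

df = pd.read_csv("results_3rd_test.csv")  # hypothetical file and column names

# Two-way ANOVA: feedback type x hand used.
model = ols("score ~ C(feedback) * C(hand)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))

# Pairwise t-tests between feedback conditions with Bonferroni correction.
pairs = [("visual", "auditory"), ("auditory", "visual_auditory"), ("visual", "visual_auditory")]
pvals = [stats.ttest_ind(df[df.feedback == a].score, df[df.feedback == b].score).pvalue
         for a, b in pairs]
_, corrected, _, _ = multipletests(pvals, method="bonferroni")
for (a, b), p in zip(pairs, corrected):
    print(f"{a} vs {b}: corrected p = {p:.3f}")
```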

Fig. 6. The subjective score of effort of training (left). The subjective score of ease of learning (right).

The left panel of Fig. 6 shows the subjective evaluation of the burden of training, and the right panel shows the subjective evaluation of ease of learning.

4 Discussion

We evaluated the learning effect of each kind of feedback in terms of three items: (i) high correct answer rates in the tests, (ii) a small difference in correct answer rate between training and test (i.e., strong learning retention), and (iii) low burden.

Fig. 7. The distribution of reaction time

Fig. 8. The distribution of each difference from a target deformation level in the 2nd test and the 3rd test.

(i) High Correct Answer Rates in Test. To determine whether the participants could learn the deformation levels independently of the task, we focus on the results of the 3rd test, which used a different song. The results show that visual-auditory feedback gives higher accuracy than auditory feedback, regardless of the order in which the changes in the pattern are presented. It is thought that visual-auditory feedback offers better training of force control than auditory feedback alone.

In more detail, Fig. 5 shows the absolute difference from the center value of the correct area in the 3rd test with regard to hand dominance. It shows that learning proficiency with the non-dominant hand is insufficient under visual and auditory feedback, whereas visual-auditory feedback has the potential to yield high learning proficiency regardless of hand dominance.

(ii) Strong Learning Retention. For visual feedback and auditory feedback, the correct answer rate tends to decrease from the 2nd test to the 3rd test (Fig. 4). These results show that visual feedback fails to maintain the correct answer rate, which agrees with the known characteristics of visual feedback [5]. Figure 8 shows the distribution of the differences from the target deformation level in the 2nd test and the 3rd test. In the 2nd test, all feedback types had the highest rate at zero difference (i.e., correct answers) and yielded sharp response curves. In the 3rd test, the sharp shape was retained only for visual-auditory feedback training; visual and auditory feedback training yielded broad distributions. This suggests that visual-auditory feedback allowed the participants to remember the deformation levels with a high retention rate, regardless of changes in the target pattern.

(iii) Less Burden. Subjective evaluations were made of the effort of training and the ease of learning. Figure 6 shows that auditory feedback training made participants feel more effort and less ease than visual-auditory feedback and visual feedback training, and that the load of auditory feedback may be mitigated if it is combined with visual feedback. One possible reason is that auditory information provides lower spatial resolution than visual information. Some participants answered that it was difficult to discern the sound tones because of their inexperience with music. This suggests that using more clearly distinct tones would probably enhance auditory feedback performance.

Other. Figure 7 shows the distribution of reaction time in the tests. It shows that visual and visual-auditory feedback yield response curves that are sharper and shifted more to the fast side than auditory feedback, which leads to more stable training.

We found that F5 (which was never presented in training and was used for the first time in the 3rd test) had essentially the same correct answer rate as the other levels. It is considered that participants could acquire this gap level without being directly trained on it.

In interviews, participants said that if they knew the melody, they could anticipate the next target deformation level more easily with auditory feedback, and that the experience was enjoyable. This suggests that using well-known songs not only motivates the user but also allows the user to predict the next target deformation level easily, reducing the effort needed to identify the target level. This would lead users to pay more attention to controlling force than when randomly selected target deformation levels are used.

In the experiment, the total training time was about 5 min and the correct answer rate remained at around 70%, so proficiency was not fully adequate. Moreover, the number of participants was small, so we could not examine training effectiveness independently of individual differences. Future work includes longer training times, more participants, and clarifying the mechanism underlying the learning characteristics of each sensory modality.

Overall, the results confirm the effectiveness of visual-auditory feedback as a tool for learning how to control pushing force in short-term training. It is thought that visual-auditory feedback yields better force control during both training (due to visual feedback) and testing (due to auditory feedback) than visual feedback or auditory feedback alone.

5 Conclusion

This paper conducted an experiment to verify the effectiveness of multimodal feedback for training people to control pushing force. The experimental set-up linked the degree of deformation of a foam object to visual feedback, auditory feedback, or both. Participants were told to control their pushing force and to learn the push levels during training, and the accuracy of level reproduction was then checked in the absence of feedback. As a result, visual-auditory feedback was confirmed to be more effective than either visual feedback or auditory feedback in isolation. We can apply the results of this study to enhance the first use of force-input devices: just five minutes of practice, similar to playing a musical instrument, can enhance force reproduction skill and make subsequent force inputs more accurate.