1 Introduction

According to the World Report on Disability [1], published in 2011, about one billion people were estimated to live with some form of disability. Moreover, between 110 and 190 million people—about 2% of the world population—have severe difficulties in functioning. People with motor impairments—as a result of amyotrophic lateral sclerosis, carpal tunnel syndrome, spinal cord injury or degenerative diseases—require assistive technology solutions to lead a more independent life. CATs are among the most efficient of these solutions, enabling hands-free computer access. They are generally based on human–computer interaction (HCI) techniques in which a mouse cursor is controlled through the user's full head control ability. For people with only reduced head movement (i.e., those who cannot operate the mouse cursor by moving the head), however, computer access becomes a very challenging task, since the user has to interact with the computer through a single head-gesture such as a head nod or a head tilt.

Computers have become indispensable tools, providing immense services in our increasingly digitalized world. Unfortunately, most people with only minimal head movement cannot benefit from these services, since they have difficulty interacting with their computers by means of current solutions. The World Report on Disability [1] also reveals that 80% of people with disabilities live in low- and middle-income countries, which means that the majority of people with only minimal head movement might not be able to afford most hands-free HCI solutions [2,3,4], since these generally depend on expensive devices. Although the aim of universal access is to enable equal opportunity and access to a service or product regardless of physical disability by reducing barriers, the high cost of most current solutions creates a new, financial barrier for the majority of the target group. Furthermore, according to International Labour Organization (ILO) statistics [5] published in 2007, an estimated 470 million of the world's working-age people live with disabilities. Although many jobs, such as software development, depend solely on computer use, the exclusion of millions of working-age people with disabilities from the labor force increases the Gross Domestic Product (GDP) lost worldwide. These people are also deprived of paid work, which would make them more independent by allowing them to support themselves financially. Any new HCI technique based on a single head-gesture will therefore play an important role in developing better CATs that enable these people to operate a computer for a more inclusive and barrier-free life.

In our effort to enable people with only reduced head control to interact with a computer by a single head-gesture, we began with a review of the current head-operated solutions in the Related Works section. We noticed that the majority of interaction techniques require complete head control; in other words, there are few solutions capable of supporting single head-gesture access for people with reduced head movement. From our literature review, we identified two major problems of current head-operated interaction techniques that support single head-gesture access: (1) the requirement of dedicated devices, and (2) limited compatibility with switch-accessible interfaces. To overcome these problems, we employed our software switch approach, whose first examples were previously presented by Esiyok et al. [6].

We propose two novel interaction techniques, HeadCam and HeadGyro, which follow the principles of the software switch approach. Both interaction techniques, the major problems of the current solutions, and our software switch approach are explained in detail in the Software Switches section. In a nutshell, both interaction techniques can serve like traditional switches by recognizing head movements via a standard camera or the gyroscope sensor of a smartphone and translating them into virtual switch presses. Furthermore, they do not require a dedicated device and are compatible with most switch-accessible interfaces. As low-cost alternatives, they can replace expensive traditional head switches for computer access. They are also capable of recognizing the motion of other body parts, such as the user's shoulder or leg, which makes them quite flexible switches: a different physical gesture can easily be targeted when the user becomes tired. Moreover, unlike physical switches, neither software switch requires physical strength to be activated; HeadGyro in particular can detect even a minimal head movement and transform it into an emulated switch press.

A usability study with 36 participants (18 motor-impaired, 18 able-bodied) was conducted to collect objective and subjective evaluation data. The SITbench 1.0 [7] benchmark was employed for objective evaluation, and a System Usability Scale (SUS) [8] questionnaire was applied for subjective evaluation. While HeadGyro showed slightly higher performance than HeadCam on every objective evaluation metric, HeadCam was rated better than HeadGyro in the subjective evaluation. All participants agreed that the idea of controlling a computer via a single head-gesture without any dedicated device sounded very promising.

Given that the majority of current solutions require expensive dedicated devices, and that 80% of people with disabilities live in low- and middle-income countries [1], the proposed software switches are expected to have a considerable impact. Currently, they are the only options for people with reduced head control (i.e., those who have to use a switch-based system for computer access) who cannot afford any dedicated device. Moreover, considering that many jobs, such as software development, depend solely on computer use, any tool for computer access undoubtedly helps these people participate in the labor force, thereby reducing the GDP lost globally. Those who can perform paid work also become more independent by supporting themselves financially. Beyond assistive technology, software switches can be employed as alternative inputs for multi-modal HCIs. Since the HeadGyro software switch is not affected by external factors such as light or wind, it could also be employed for outdoor activities (e.g., operating a wheelchair).

This paper proceeds with the Related Works section, which summarizes the current head-operated interaction techniques for computer access. In the Software Switches section, we identify the common problems of current interaction techniques and introduce our software switch approach together with the two software switches, HeadGyro and HeadCam, proposed within this paper. Subsequently, we evaluate both interaction techniques by presenting the objective and subjective results of our usability study in the Evaluation section. Finally, we conclude and discuss our study in the Conclusion and Discussion section.

2 Related works

In this section, we review the current head-operated HCI solutions that provide alternative means for computer access. We separate them into two main groups according to whether they support single head-gesture access.

2.1 Head-operated interaction techniques without a single head-gesture access support

Interaction techniques in this group require complete head control for hands-free computer access. In principle, they translate the user's head movements into mouse cursor movements in several ways:

One of the most popular techniques is wearing inertial sensors, such as a gyroscope or an accelerometer, on the head (via a helmet or a cap) to control the mouse pointer [9,10,11,12,13,14,15,16,17,18,19]. These inertial sensor-based systems are mostly combined with a separate sensor or switch to perform mouse click tasks (e.g., head movements are detected by inertial sensors to control the mouse pointer, while mouse clicks are performed by a puff switch). Another sensor-based solution, Headmaster Plus [20], which was evaluated by LoPresti et al. [21], consists of ultrasonic sensors. Briefly, the user wears a headset with three ultrasonic sensors that await ultrasonic signals from a stationary transmitter on the user's computer. In this way, the sensors determine the orientation of the user's head and convert it into mouse pointer coordinates.

Using a head pointer—in principle, a head-worn stick—is another solution, permitting the user to control, press or touch any target by head [22], although this method is rarely preferred nowadays. Similarly, head-operated joysticks are alternative tools that enable the user to point the mouse cursor on the screen [23].

Alternatively, a specific part of the user's face (e.g., the tip of the nose) or the user's whole head can be tracked by a standard camera in order to transform head movements into mouse cursor movements on the computer screen [24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44]. Mouse click tasks, such as left or right click, are generally performed with the dwelling method (i.e., the user holds the mouse cursor steady for a given amount of time to perform the click) or with multi-modal approaches using other gestures such as eye-blinks or tooth-clicks.

In addition to the above-mentioned approaches, head movements can also be followed by special camera-based systems to control the mouse cursor. In such systems, the user wears small reflective dots on the head or face, or an infrared LED (light-emitting diode) placed on a helmet or a pair of glasses. The reflective dots are illuminated by an infrared or near-infrared light source, and a standard camera [45,46,47] or an infrared camera [48] tracks the position of the target signals (coming from the reflective dots or the infrared LED) for mouse cursor pointing. RGB-D cameras, a newer vision sensor technology, can also map the head position in 3D to control the mouse pointer [49].

2.2 Head-operated interaction techniques with a single head-gesture access support

For those with only reduced head control, there are few solutions that support single head-gesture access. Using a traditional button switch via a scanning interface is a common technique, where a head switch is mounted close to the user's head so that the user can hit it by tilting the head (or by any movement of the head) [50, 51]. In addition to traditional hardware switches, there are only a few software-based solutions [52, 53] that perform mouse click tasks with a single head-gesture. In these software-based solutions, the user first navigates the mouse cursor to the desired location using vision-based head tracking, and mouse clicks are then emulated according to the user's head-gestures as an alternative to the dwelling method.

3 Software switches

This section begins with a subsection that explains how we address the detected problems of current interaction techniques by applying our software switch approach. Then, we introduce the common user interface of both proposed software switches. Afterward, the HeadCam and HeadGyro software switches are explained in turn.

3.1 The software switch approach

Our software switch approach has two principles: an interaction technique based on it (1) should not require any dedicated device, and (2) should be configurable to be compatible with switch-accessible interfaces. Following these principles, we propose two interaction techniques within this study. The major problems detected in the single head-gesture compatible interaction techniques (Sect. 2.2), and the solutions we propose based on our software switch approach, are presented below:

  1. Requirement of Dedicated Devices: The majority of current solutions for computer access depend on dedicated devices, which can be hard to afford for those living in low- and middle-income countries [2,3,4]. The high cost of dedicated devices creates a new, financial barrier: however efficient, a new solution based on an expensive device is of little use to these people unless it is affordable. Therefore, as the first principle of our software switch approach, interaction techniques for people with reduced head control should not require any dedicated device beyond standard computer peripherals such as a microphone or a camera. As the only reasonable exception, we decided to exclude smartphones from the list of dedicated devices, because the total number of smartphones—3.2 billion in 2019 [54]—has surpassed the total number of computers worldwide in recent years [55], which makes them accessible even to people in low-income countries. Besides, unlike dedicated devices built for a single purpose, smartphones provide a wide range of services to their users. To sum up, while the software-based solutions [52, 53] do not require any dedicated device, traditional button switches are dedicated devices beyond standard computer peripherals. As low-cost solutions, the HeadCam and HeadGyro software switches are based on a standard camera and the gyroscope sensor of a smartphone, respectively.

  2. Compatibility with Switch-accessible Interfaces: The majority of current solutions reported in the literature are only compatible with a specific switch-accessible interface. To clarify this, the mechanism of a scanning-based interface and the standardization problem must first be understood. In principle, unlike direct selection (such as typing on a keyboard), a scanning interface highlights items one by one on the computer screen, and the user activates the switch when the desired item is highlighted. Between the switch and the switch-accessible interface sits a switch adapter, a dedicated device that transforms switch activation signals into meaningful keyboard presses or mouse clicks. Following a switch activation, the switch adapter emulates a specific keyboard character or a mouse click event (depending on the manufacturer of the switch interface) and sends it to the computer in order to communicate with the switch-accessible interface. The main problem here is that there is no commonly agreed standard for the communication between switches and switch-accessible interfaces: while some switch-accessible interfaces expect a specific keyboard character such as space, others expect a mouse click. This standardization problem is partially solved by switch driver software that permits the user to assign the specific character or mouse click—sent following a switch activation—expected by the target switch-accessible interface. However, such driver software is only compatible with a limited number of switch adapters from specific brands, which makes it a partial solution to the standardization problem; in other words, each switch adapter requires its own driver software. Although the current software-based solutions [52, 53]—which are able to emulate mouse clicks—support single head-gesture access and do not require any dedicated device, they are only compatible with switch-accessible interfaces that accept a mouse click as the switch input signal. To the best of our knowledge, there is no complete solution to this standardization problem in the literature. Both interaction techniques proposed within this study can be configured to generate any expected keyboard character or mouse click, which makes them compatible with most switch-accessible interfaces (see the sketch below). They provide a better solution to the standardization problem than the current approach, in which both a switch driver and a traditional switch must be purchased. In other words, they both detect a head-gesture like a traditional switch and allow the user to assign the keyboard character or mouse click—sent to the switch-accessible interface—like a switch driver.
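To illustrate the second principle, the sketch below shows how a configurable output stage for a software switch could emulate either a keyboard character or a mouse click on Windows/.NET (the platform both switches target). The class and method names are ours for illustration, not the published implementation; SendKeys and mouse_event are standard Windows APIs.

```csharp
using System.Runtime.InteropServices;
using System.Windows.Forms; // SendKeys

// Illustrative sketch: maps a detected switch activation to whatever
// signal the target switch-accessible interface expects.
public enum SwitchOutput { KeyPress, LeftClick }

public static class SwitchEmulator
{
    [DllImport("user32.dll")]
    private static extern void mouse_event(uint flags, uint dx, uint dy, uint data, int extraInfo);

    private const uint MOUSEEVENTF_LEFTDOWN = 0x0002;
    private const uint MOUSEEVENTF_LEFTUP   = 0x0004;

    // Called whenever the software switch detects an activation.
    // 'key' is the user-assigned character (e.g., space) for scanning
    // interfaces that expect a keyboard input.
    public static void Activate(SwitchOutput output, string key = " ")
    {
        if (output == SwitchOutput.KeyPress)
        {
            SendKeys.SendWait(key); // emulated keyboard character
        }
        else
        {
            mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0); // emulated press
            mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0);   // and release
        }
    }
}
```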

3.2 The user interface

Fig. 1 a The initial state of the interface of both software switches. b Rotational movements of the head

Fig. 2 Six different intersection states of the interface according to rotational movements of the head

Fig. 3 Steps of the head tracking algorithm: a capture video frames via the camera; b apply Euclidean color filtering to each frame; c convert video frames to gray-scale; d detect all objects in each frame by the connected component labeling method; e choose the largest object in each frame; f track the position of the largest object

Fig. 4 The placement of the smartphone on the user's head for the HeadGyro software switch

We designed a common interface, shown in Fig. 1a, which was employed for both software switches, and applied gamification techniques to make the software switches more engaging and fun. The initial state of the interface—where the user's head is in a stable position—can be seen in Fig. 1a. The interface includes three dynamic game elements: (1) the earth, (2) the left and (3) the right red border lines. All three elements can be controlled by the user's head movements, called pitch, yaw and roll (Fig. 1b). The sensitivity used to control the game elements can be set according to the user's head control capability: the higher the sensitivity level, the slower and smaller the head movement needed to move the game elements. The mission of the game is to save the earth from the gravity of a black hole by moving these three game elements until the earth intersects with the red border lines. Switch press and switch release are emulated according to this intersection: as soon as the earth intersects with the red border lines, a switch press is emulated until the end of the intersection, and a switch release is emulated once the intersection is terminated. Each intersection (i.e., switch press) is followed by visual or auditory feedback to the user. In order to calibrate the earth's position, we simulated a gravity function that constantly pulls the earth toward the black hole. The gravity function is ineffective during an intersection (i.e., switch press) and is reactivated once the intersection is over (i.e., switch release). In this way, if the user keeps his/her head stable for a while when there is no intersection, the earth is eventually pulled back to its initial position at the center. As illustrated in Fig. 2, each of the six head-gestures (i.e., rotational movements of the head) results in a different intersection state. While pitch (Fig. 2a) and yaw (Fig. 2b) movements control the earth's position, roll movements (Fig. 2c) operate the position of the right and the left red border lines.
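The following minimal sketch illustrates this press/release and gravity logic; the class, constants, and update model are our own simplification of what is described above, not the authors' actual code.

```csharp
using System;

// Simplified sketch of the interface logic: a switch press is emulated while
// the earth intersects a border line, and a constant "gravity" pulls the
// earth back toward the center (the black hole) whenever there is no
// intersection. Names and constants are illustrative.
public class SwitchGame
{
    private bool pressed;
    private double earthX;               // earth offset from the center
    private const double Gravity = 0.5;  // pull-back per update tick (assumed)

    public event Action SwitchPressed;   // wired to the switch output stage
    public event Action SwitchReleased;

    // Called once per frame with the head-driven positions.
    public void Update(double earthOffset, double borderOffset)
    {
        earthX = earthOffset;
        bool intersecting = Math.Abs(earthX) >= borderOffset;

        if (intersecting && !pressed) { pressed = true; SwitchPressed?.Invoke(); }
        else if (!intersecting && pressed) { pressed = false; SwitchReleased?.Invoke(); }

        // Gravity acts only while there is no intersection (no switch press).
        if (!intersecting)
            earthX -= Math.Sign(earthX) * Math.Min(Gravity, Math.Abs(earthX));
    }
}
```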

3.3 HeadCam

HeadCam is based on a real-time video motion tracking algorithm similar to that of Esiyok et al. [6]. In principle, the user's head is tracked by a built-in camera or a standard webcam, and the roll movements of the user's head captured by the camera (see Fig. 2c) are translated into emulated switch presses. Before launching the HeadCam application, in the configuration step, the user assigns the color of the tracked object through an RGB (red, green, blue) sphere with a specified radius for Euclidean color filtering. The algorithm of HeadCam is listed step by step below:

  • Video frames are captured by the camera at a frame rate of 15 frames per second and a frame size of \(320\times 240\) pixels (Fig. 3a);

  • Euclidean color filtering is applied to each video frame (Fig. 3b). The filter removes the colors outside the RGB sphere whose center and radius were assigned at the configuration step; in other words, it keeps the pixels within the specified color sphere and fills the remaining pixels with black;

  • Following Euclidean color filtering, video frames are converted to gray-scale images (Fig. 3c);

  • All objects in each video frame are detected through the Connected Component Labeling (CCL) method, which groups together pixels belonging to the same connected component and treats each component as a separate object. Following object detection, a bounding rectangle is drawn around each object (Fig. 3d);

  • The largest object (i.e., the one whose rectangle has the largest area) is chosen if more than one object is detected (Fig. 3e);

  • The center point of the largest object's rectangle is tracked in real time on the frame (Fig. 3f);

  • Every motion of the largest object (i.e., of the center point of its rectangle) is transformed into motion of the right or left red border lines, as depicted in Fig. 2c;

  • Once the earth intersects with the red border lines, a switch press is emulated.

The AForge.NET image processing library was employed for filtering (Euclidean color filtering) and object detection (CCL). HeadCam is compatible with Windows-based operating systems and was developed on the .NET 4.5 framework. The two roll movements of the user's head (right and left head tilts) are easily recognized by HeadCam, which makes our software switch capable of providing two switch inputs for switch-accessible interfaces.
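As a concrete illustration of the per-frame pipeline above, the sketch below uses the AForge.NET filters named in this section; the class structure and the example color values are ours, not the published implementation.

```csharp
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// Illustrative per-frame pipeline following steps b-f above.
public class HeadTracker
{
    // Color sphere from the configuration step (example values).
    private readonly EuclideanColorFiltering colorFilter =
        new EuclideanColorFiltering { CenterColor = new RGB(230, 30, 30), Radius = 100 };

    private readonly BlobCounter blobCounter = new BlobCounter();

    // Returns the center of the largest detected object, or null if none.
    public Point? ProcessFrame(Bitmap frame)
    {
        colorFilter.ApplyInPlace(frame);                              // step b
        Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(frame);  // step c

        blobCounter.ProcessImage(gray);                               // step d (CCL)
        Rectangle[] rects = blobCounter.GetObjectsRectangles();
        if (rects.Length == 0) return null;

        Rectangle largest = rects[0];                                 // step e
        foreach (Rectangle r in rects)
            if (r.Width * r.Height > largest.Width * largest.Height)
                largest = r;

        // Step f: this tracked point drives the red border lines (Fig. 2c).
        return new Point(largest.X + largest.Width / 2,
                         largest.Y + largest.Height / 2);
    }
}
```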

3.4 HeadGyro

The HeadGyro interaction technique employs the 3-axis gyroscope data of a smartphone—placed on the user's head—to convert the rotational movements of the user's head into emulated switch presses. The smartphone can be placed on the user's head in several ways; for example, the user can wear a cap to which the smartphone is attached, or a modified belt holding the smartphone, as can be seen in Fig. 4. A gyroscope is an inertial sensor that measures the angular velocity of the sensor in inertial space; in other words, it measures the rate of change of the sensor's orientation. Today, inertial sensors such as gyroscopes are based on microelectromechanical system (MEMS) technology. They are frequently employed in modern smartphones since they are small, cheap, light, and offer low power consumption. Despite these advantages, MEMS-based sensors can suffer from noise caused by electromagnetic interference and semiconductor thermal noise, which affects the accuracy of the measured angular velocity. To reduce this noise, we adopted the Kalman filter, a method frequently used in the literature for gyroscope data [56,57,58,59] that also satisfies our real-time requirements. We also developed an Android mobile application—which communicates with the computer over a wireless local area network (WLAN)—to stream the 3-axis gyroscope data to the computer. As can be seen in Fig. 1, roll, pitch, and yaw movements are represented by the angular velocity around the X, Y, and Z axes of the coordinate system, respectively. The algorithm behind HeadGyro is briefly described step by step below:

  • Real-time angular velocity data from the smartphone's 3-axis gyroscope sensor are read by our Android application;

  • The Android application streams these gyroscope data—three angular velocity measurements, one per axis (X, Y, Z)—wirelessly to the HeadGyro software switch running on the computer;

  • For each channel (X, Y, Z), the Kalman filter is applied to reduce the noise, as shown in Fig. 5;

  • Every motion of the user's head is recognized from the filtered angular velocity measurements of the three axes, and these measurements are converted into motion of the game elements, as illustrated in Fig. 2. For example, a positive angular velocity around the z-axis moves the earth to the left, while a negative value moves it to the right;

  • Once the earth intersects with the red border lines, a switch press is emulated.

For the Kalman filter, we employed the MathNet.Filtering library. Like the HeadCam software switch, HeadGyro is compatible with Windows-based operating systems and was developed on the .NET 4.5 framework. It can provide up to six switch inputs for switch-accessible interfaces, since all six rotational head movements are easily detected by HeadGyro.
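To make the filtering step concrete, the sketch below is a minimal hand-rolled scalar Kalman filter for a single gyroscope channel, assuming a near-constant angular velocity between samples; the noise parameters are illustrative, and the actual implementation uses the MathNet.Filtering library.

```csharp
// Minimal scalar Kalman filter for one gyroscope axis. One instance is
// used per channel (X, Y, Z). Q and R values are illustrative.
public class ScalarKalman
{
    private double x;          // filtered angular velocity estimate
    private double p = 1.0;    // estimate variance
    private readonly double q; // process noise variance
    private readonly double r; // measurement noise variance

    public ScalarKalman(double processNoise = 0.01, double measurementNoise = 0.5)
    {
        q = processNoise;
        r = measurementNoise;
    }

    // Feed one raw sample; returns the filtered value.
    public double Update(double measurement)
    {
        p += q;                     // predict: variance grows by process noise
        double k = p / (p + r);     // Kalman gain
        x += k * (measurement - x); // correct toward the new measurement
        p *= 1 - k;                 // variance shrinks after the update
        return x;
    }
}

// Usage: double filteredZ = kalmanZ.Update(rawZ);
// A positive filteredZ moves the earth left, a negative one right (Fig. 2).
```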

Fig. 5 Two stream data graphs based on the x-axis of the gyroscope sensor of two different participants nodding their heads. Blue and red lines represent unfiltered and Kalman-filtered gyroscope data, respectively

4 Evaluation

A usability study was conducted to collect objective and subjective data. In this section, we first introduce the characteristics of the participants. Then, we present the apparatus used within this study. Afterward, we briefly explain the SITbench 1.0 and the procedure applied during the evaluation of HeadCam and HeadGyro. Finally, we conclude the section with our experimental findings.

4.1 Participants

Following approval by the Ethics Committee of Izmir Katip Celebi University (Turkey) on 10.10.2018 (decision number: 332), the usability study was conducted at the Medical Faculty of the same university. All participants gave their informed consent before participating in the study, and consent for the publication of human images in this article was also obtained. A total of 36 participants, including 18 females and 18 males, took part in the evaluation of the proposed systems. The disability group (DG) comprised 18 participants with motor disabilities (6 females, 12 males) aged between 18 and 68, while the control group (CG) without disabilities included 18 people (12 females, 6 males) aged between 18 and 59.

In Table 1, the age statistics of all participants are summarized by group, and Table 2 summarizes their main characteristics. As an inclusion criterion, all voluntary participants in DG had difficulties controlling their hands and thus could not operate a computer in the conventional way (i.e., with a mouse and a keyboard). They were all under medical treatment for various motor disabilities while the experiments were conducted. The voluntary participants of CG were generally companions of DG members or staff working at the Physical Medicine and Rehabilitation Department. All participants met the following inclusion criteria: they were able to (1) find a target on the screen; (2) follow a moving target; (3) maintain gaze on a stable target; and (4) stay focused on the tests during the experiments. In addition, five participants in DG had reduced head control. We also applied the mini-mental state examination (MMSE)—a 30-point questionnaire for cognitive assessment—to verify that the participants had the cognitive ability to complete our tests.

Table 1 Age statistics of the participants according to groups
Table 2 Main characteristics of the participants

4.2 Apparatus

A laptop computer (Lenovo G505S; CPU: AMD A8-4500M 1.9 GHz; RAM: 6 GB DDR3; screen: 15.6-inch LCD; OS: Windows 10, 64-bit; resolution: \(1600 \times 900\)), an integrated camera (maximum digital video resolution: \(1280\times 720\); image sensor type: 0.3 MP CMOS), and a smartphone with a gyroscope sensor (Sony Xperia XZ1 Compact; CPU: Qualcomm Snapdragon 835; RAM: 4 GB; OS: Android Oreo 8.0) were employed for the experiments.

4.3 The SITbench 1.0 benchmark

We used the SITbench 1.0 [7] benchmark, which helps researchers evaluate switch-based systems objectively. By means of this tool, objective evaluation data can be collected and saved automatically through standardized tests. To this end, we employed the Tie-Smiley Matching Game (TSMG) and the Hungry Frog Game (HFG) tests of the SITbench 1.0.

4.3.1 TSMG

Briefly, TSMG is a switch-accessible interface based on the automatic linear scanning method, where each smiley is highlighted one by one for a given scan time. It includes five different templates. As can be seen in Fig. 6, the scanning array of each template consists of 26 smileys in total; the count and order of the red and yellow smileys differ across templates. As an indirect selection, the user activates the switch when the highlighted smiley is a red one. A click sound is also provided as an auditory prompt once a target red smiley is highlighted. The mission of the game is to match each smiley with a tie of the same color (i.e., red to red, yellow to yellow). To achieve this, the user activates the switch only if the highlighted smiley is red. Figure 6 shows a sample view after a user has completed a trial. The confusion matrix variables—true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN)—are counted automatically. All performance metrics—accuracy, precision, recall, and false-positive rate—are then calculated by the SITbench 1.0 according to the following formulas:

$$\text{accuracy} = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}}$$
(1)

$$\text{precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$
(2)

$$\text{recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$
(3)

$$\text{false positive rate} = \frac{\text{FP}}{\text{FP} + \text{TN}}$$
(4)
Fig. 6 A general view of TSMG following a user performance with several mistakes (i.e., with false negatives and false positives)

4.3.2 HFG

HFG is the other single-switch-accessible test of the SITbench 1.0 (Fig. 7). In a nutshell, a trial includes ten tasks, and each task proceeds as follows: (a) the user does not move until a fly appears on the screen; (b) the user activates the switch as quickly as possible once the fly appears; (c) a frog eats the fly when the switch is activated. After the ten tasks of a trial are completed, the SITbench 1.0 automatically measures the following six evaluation metrics: (1) the average press time over all ten tasks, i.e., the average time from when the fly appears to when the switch is pressed; (2) the average release time over all ten tasks, i.e., the average time from when the switch is pressed until it is released; (3) the fastest press time within the ten tasks; (4) the slowest press time; (5) the fastest release time; and (6) the slowest release time. HFG includes five different scenarios, which differ in their waiting times (i.e., the time from when the user starts waiting for a fly until it appears on the screen).
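The press and release times can be understood through the following sketch (our own illustration of the metric definitions, not SITbench code), which times each task with a stopwatch.

```csharp
using System.Diagnostics;

// Illustrative timing of one HFG task: press time runs from the fly's
// appearance to the switch press; release time from press to release.
public class TaskTimer
{
    private readonly Stopwatch sw = new Stopwatch();

    public double PressTimeMs   { get; private set; }
    public double ReleaseTimeMs { get; private set; }

    public void OnFlyShown()       { sw.Restart(); }
    public void OnSwitchPressed()  { PressTimeMs = sw.Elapsed.TotalMilliseconds; sw.Restart(); }
    public void OnSwitchReleased() { ReleaseTimeMs = sw.Elapsed.TotalMilliseconds; }
}
```

Averaging PressTimeMs and ReleaseTimeMs over the ten tasks, and taking their minima and maxima, yields the six metrics listed above.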

Fig. 7 A view of HFG at the end of a trial

4.4 The SUS questionnaire

The SUS questionnaire [8], an industry standard, consists of ten statements rated on a five-point Likert scale, as can be seen in Table 3. Scale values range from 1 to 5 (1 = strongly disagree, 2 = disagree, 3 = neither agree nor disagree, 4 = agree, 5 = strongly agree). A SUS score (ranging from 0 to 100) is calculated from the scale values as follows: (1) the score contributions of all statements are summed, where the contribution of statements 1, 3, 5, 7 and 9 is the scale value minus 1, and the contribution of statements 2, 4, 6, 8 and 10 is 5 minus the scale value; (2) the sum of the contributions is multiplied by 2.5 to obtain the SUS score.
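As a worked example, the function below (our own sketch) computes the SUS score from the ten responses exactly as described; all "agree" (4) on the odd statements and all "disagree" (2) on the even statements yields (5 × 3 + 5 × 3) × 2.5 = 75.

```csharp
// SUS score from ten Likert responses (responses[0] = statement 1, values 1-5).
public static double SusScore(int[] responses)
{
    double sum = 0;
    for (int i = 0; i < 10; i++)
        sum += (i % 2 == 0)
            ? responses[i] - 1   // statements 1, 3, 5, 7, 9
            : 5 - responses[i];  // statements 2, 4, 6, 8, 10
    return sum * 2.5;            // scale the 0-40 sum to 0-100
}
```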

4.5 Procedure

At the beginning, the participants were informed about the tests verbally. Then, we ensured that the participants and devices were positioned properly. Following proper positioning, we let them practice the tests (in a counterbalanced order) under our guidance until they felt confident to start. Afterwards, we applied two tests of the SITbench 1.0 to collect objective data: (1) TSMG: each software switch was tested by each participant (\(n=36\)) with the first three templates of TSMG, with a scan time of 1000 milliseconds; (2) HFG: each software switch was tested by each participant (\(n=36\)) with the first three scenarios of HFG.

We applied the tests in a counterbalanced order to avoid learning and repetition effects. In order to prevent mental or physical fatigue, we allowed the participants to rest for up to 5 min between the experiments. Each participant took 15–30 min to complete the experiments, including breaks, and we did not observe any fatigue at any point during the experiments. At the end of the SITbench 1.0 experiments, we applied the SUS questionnaire for quantitative subjective evaluation. Besides, we collected qualitative subjective data through our observations and the participants' responses to open-ended questions about the two software switches proposed within this study.

Table 3 SUS questionnaire statements with average scale values across participant groups

4.6 Objective data based results

As can be seen in Fig. 8, according to the results of the TSMG experiments, HeadGyro demonstrated slightly better performance than HeadCam on all evaluation metrics (accuracy, precision, recall, and false-positive rate). In terms of accuracy, the mean value of HeadGyro (\(m=0.938\)) was greater than that of HeadCam (\(m=0.904\)), and the difference between the means was statistically significant (p < 0.05) according to a Student's t-test. For precision, HeadGyro (\(m=0.921\)) performed better than HeadCam (\(m=0.872\)), with a significant difference between the means (p < 0.05). Regarding recall, HeadGyro (\(m=0.910\)) again outperformed HeadCam (\(m=0.863\)), with a significant difference between the means (p < 0.05). For the false-positive rate, HeadCam (\(m=0.077\)) scored higher (i.e., worse) than HeadGyro (\(m=0.048\)), and the difference between the means was significant (p < 0.05).

Fig. 8 Mean values of the interaction techniques acquired from all participants for the TSMG evaluation metrics: accuracy, precision, recall, and false-positive rate (*p < 0.05)

Figure 9 presents the mean values of each software switch for TSMG by participant group (Mix, DG, CG). CG members performed better than DG members with both software switches in terms of the mean accuracy, precision and recall. For the false-positive rate, DG scored higher than CG with both software switches, which means that DG members made false selections more frequently than CG members. Student's t-tests were applied for both interaction techniques across all evaluation metrics to check whether there was a significant difference between the performance of DG and CG members; the difference between the means of DG and CG was not significant for any metric.

Fig. 9 Mean values of the software switches for the evaluation metrics (accuracy, precision, recall, false-positive rate) according to the participant groups (Mix, DG, CG)

Likewise, HeadGyro performed better than HeadCam on all HFG evaluation metrics (Fig. 10): average press time, average release time, the fastest press time, the slowest press time, the fastest release time, and the slowest release time. The mean and p-values of both interaction techniques for the HFG experiments are presented in Table 4. According to the p-values of the Student's t-tests over all participants, there is a statistically significant difference between the means of HeadGyro and HeadCam on all evaluation metrics.

Fig. 10 Mean values of the two software switches for all participants for the HFG evaluation metrics (average press time, the fastest press time, the slowest press time, average release time, the fastest release time, and the slowest release time) (*p < 0.05; **p < 0.01)

Table 4 Mean values of HeadGyro and HeadCam across the HFG evaluation metrics (average press time, the fastest press time, the slowest press time, average release time, the fastest release time, and the slowest release time) for all participants

4.7 Subjective data based results

The results of the SUS questionnaire, as quantitative subjective data, are listed in Table 3. The average scale values acquired from all participants are given for HeadGyro and HeadCam by participant group (Mix, DG, and CG). For the mix group, the average SUS scores are 85.0 for HeadGyro and 87.9 for HeadCam. In DG, the average SUS score is 85.3 for HeadGyro and 87.8 for HeadCam; in CG, it is 84.7 for HeadGyro and 88.0 for HeadCam. According to the SUS adjective rating scale [60], all of these SUS scores can be considered excellent. After the experiments, all participants agreed that both proposed interaction techniques are promising solutions for computer access tasks, and they declared that they were looking forward to experiencing both software switches to control a computer. Regarding the SITbench 1.0 experiments, five participants stated that they would have performed better if the scanning speed of the TSMG test had been set to a slower value, while four participants suggested increasing the size of the smileys. All participants were pleased with the visual and auditory feedback provided during the tests once the switch was activated or the target appeared. While 31 participants declared that they would prefer HeadCam for computer access, 5 chose HeadGyro as their favorite software switch. They all agreed that the gamification techniques made the software switches more engaging. None of the participants experienced any fatigue during the tests.

5 Conclusion and discussion

Hands-free computer access via head movements is already a challenging task compared to conventional means, but for people with limited head control it becomes even more challenging, since the user is obliged to interact with the computer through a single head-gesture such as a head nod or a head tilt. Moreover, the high cost of the dedicated devices employed by the majority of current head-operated HCI solutions creates a new barrier, even though the aim of universal access is to break down barriers and enable equal opportunity and access for people with disabilities.

Alternative computer access methods can provide many useful services for people with motor disabilities in every part of life, such as communication and education. Any new interaction technique enabling computer access with minimal head movements will obviously help to enhance the quality of life and self-sufficiency of people with only reduced head control. Therefore, we proposed two novel interaction techniques, HeadGyro and HeadCam, which depend on the gyroscope sensor of a smartphone and a standard camera, respectively. Both interaction techniques are based on our software switch approach, which provides a comprehensive solution to the following problems of current single head-gesture based interaction techniques: (1) the requirement of dedicated devices and (2) compatibility with switch-accessible interfaces. In accordance with the two principles of our software switch approach, the HeadGyro and HeadCam software switches (1) do not require any dedicated devices and (2) are configurable to be compatible with switch-accessible interfaces. In a nutshell, both software switches can serve like traditional switches by recognizing head movements via a standard camera or the gyroscope sensor of a smartphone and transforming them into virtual switch presses.

According to the evaluation data of the usability study conducted with 36 participants (18 motor-impaired, 18 able-bodied), HeadGyro showed slightly better performance than HeadCam in the objective evaluation, while HeadCam was rated better than HeadGyro in the subjective evaluation. Furthermore, 31 participants declared that they would prefer HeadCam for computer access, while 5 selected HeadGyro. Based on our observations, the reasons are as follows: (1) head control ability is the key factor—those with complete head control (31 participants) preferred HeadCam, while those with reduced head control (5 participants) preferred HeadGyro, since it is more sensitive and thus capable of recognizing tiny head movements; (2) those with complete head control can easily activate the software switch via a standard camera, and, as expected, wearing a smartphone on the head was considered unnecessary by participants whose head control remained unimpaired or whose head movements could be detected by HeadCam. However, HeadGyro is advantageous if (1) the user cannot move the head enough to be recognized by a camera, or (2) external factors (e.g., low/high light or a moving object behind the user) cannot be tolerated by camera-based tracking. As the objective evaluation results show, HeadGyro works more sensitively than HeadCam.

Both software switches can serve as the only low-cost options for people with limited head control who cannot afford systems depending on high-cost dedicated devices. Beyond head motions, the proposed software switches are quite flexible, as they can recognize other body motions and transform them into emulated switch presses; this flexibility also permits the user to change the targeted body motion upon becoming tired. The proposed software switches can also be employed by multi-modal systems as new input techniques beyond the assistive technology area (e.g., as a new input for a computer video game). As another application domain, the HeadGyro software switch might be preferred for outdoor activities, since it is robust against external factors such as low light, noise, and weather conditions. In future work, any other physical gesture well controlled by the user could be targeted to evaluate the efficiency and usability of the proposed interaction techniques. Both software switches could also be employed by a single-switch accessible CAT to assess their performance in a real-life scenario.