Usability Methods and Attributes Reported in Usability Studies of Mobile Apps for Health Care Education: Scoping Review

Background: Mobile devices can provide extendable learning environments in higher education and motivate students to engage in adaptive and collaborative learning. Developers must design mobile apps that are practical, effective, and easy to use, and usability testing is essential for understanding how mobile apps meet users’ needs. No previous reviews have investigated the usability of mobile apps developed for health care education. Objective: The aim of this scoping review is to identify usability methods and attributes in usability studies of mobile apps for health care education. Methods: A comprehensive search was carried out in 10 databases, reference lists, and gray literature. Studies were included if they dealt with health care students and the usability of mobile apps for learning. Frequencies and percentages were used to present the nominal data, together with tables and graphical illustrations. Examples include a figure of the study selection process, an illustration of the frequency of inquiry usability evaluation and data collection methods, and an overview of the distribution of the identified usability attributes. We followed the Arksey and O’Malley framework for scoping reviews. Results: Our scoping review collated 88 articles involving 98 studies, mainly related to medical and nursing students. The studies were conducted in 22 countries and were published between 2008 and 2021. Field testing was the main usability experiment used, and the usability evaluation methods were either inquiry-based or based on user testing. Inquiry methods were predominantly used: 1-group design (46/98,


Background
Mobile devices can provide extendable learning environments and motivate students to engage in adaptive and collaborative learning [1,2]. Mobile devices offer various functions, enable convenient access, and support the ability to share information with other learners and teachers [3]. Most students own a mobile phone, which makes mobile learning easily accessible [4]. However, there are some challenges associated with mobile devices in learning situations, such as small screen sizes, connectivity problems, and multiple distractions in the environment [5].
Developers of mobile learning apps need to consider usability to ensure that apps are practical, effective, and easy to use [1] and to ascertain that mobile apps meet users' needs [6]. According to the International Organization for Standardization, usability is defined as "the extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use" [7]. Better mobile learning usability will be achieved by focusing on user-centered design and attention to context, ensuring that the technology corresponds to the user's requirements and putting the user at the center of the process [8,9]. In addition, it is necessary to be conscious of the interrelatedness between usability and pedagogical design [9].
A variety of usability evaluation methods exists to test the usability of mobile apps, and Weichbroth [10] categorized them into the following 4 categories: inquiry, user testing, inspection, and analytical modeling. Inquiry methods are designed to gather data from users through questionnaires (quantitative data) and interviews and focus groups (qualitative data). User testing methods include think-aloud protocols, question-asking protocols, performance measurements, log analysis, eye tracking, and remote testing. Inspection methods, in contrast, involve experts testing apps, heuristic evaluation, cognitive walk-through, perspective-based inspections, and guideline reviews. Analytical modeling methods include cognitive task analysis and task environment analysis [10]. Across these 4 usability evaluation methods, the most commonly used data collection methods are controlled observations and surveys, whereas eye tracking, think-aloud methods, and interviews are applied less often [10].
Usability evaluations are normally performed in a laboratory or in field testing. Previous reviews have reported that usability evaluation methods are mainly conducted in a laboratory, that is, in a controlled environment [1,11]. By contrast, field testing is conducted in real-life settings. There are pros and cons to the 2 approaches: field testing allows data collection within a dynamic environment, whereas in a laboratory data collection and conditions are easier to control [1]. A variety of data collection methods are appropriate for usability studies; for instance, in laboratories, participants often perform predefined tasks while data are collected through questionnaires and observations [1]. In field testing, logging mechanisms and diaries have been applied to capture user interaction with mobile apps [1].
In all, 2 systematic reviews examined various psychometrically tested usability questionnaires as a means of enhancing the usability of apps. Sousa and Lopez [12] identified 15 such questionnaires and Sure [13] identified 13. In all, 5 of the questionnaires have proven to be applicable in usability studies in general: the System Usability Scale (SUS), Questionnaire for User Interaction Satisfaction, After-Scenario Questionnaire, Post-Study System Usability Questionnaire, and Computer System Usability Questionnaire [12]. The SUS questionnaire and After-Scenario Questionnaire are most widely applied [13]. The most frequently reported usability attributes of these 5 questionnaires are learnability, efficiency, and satisfaction [12].
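The review itself does not describe how the SUS is scored, but as an illustration of why psychometrically tested questionnaires such as the SUS are straightforward to apply, the standard scoring procedure (10 items rated 1-5; odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5 to yield a 0-100 score) can be sketched as follows. The function name is ours, not part of any cited instrument:

```python
def sus_score(ratings):
    """Compute the System Usability Scale score from 10 item ratings (1-5).

    Odd-numbered items (1, 3, 5, 7, 9) are positively worded and contribute
    (rating - 1); even-numbered items (2, 4, 6, 8, 10) are negatively worded
    and contribute (5 - rating). The summed contributions are scaled by 2.5
    so the final score falls on a 0-100 scale.
    """
    if len(ratings) != 10 or not all(1 <= r <= 5 for r in ratings):
        raise ValueError("SUS requires 10 ratings, each between 1 and 5")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 corresponds to item 1
        for i, r in enumerate(ratings)
    ]
    return sum(contributions) * 2.5


# Example: a fairly positive response pattern
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # prints 75.0
```

A score of 75 would typically be interpreted as above-average usability, although interpretation thresholds vary across studies.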
Usability attributes are features that measure the quality of mobile apps [1]. The most commonly reported usability attributes are effectiveness, efficiency, and satisfaction [5], which are part of the usability definition [7]. In the review by Weichbroth [10], 75 different usability attributes were identified. Given the wide selection of usability attributes, choosing appropriate attributes depends on the nature of the technology and the research question in the usability study [14]. Kumar and Mohite [1] recommended that researchers present and explain which usability attributes are being tested when mobile apps are being developed.
Previous reviews have examined the usability of mobile apps in general [5,10,11,14,15]; however, only one systematic review has specifically explored the usability of mobile learning apps [1], and that review did not include studies from health care education. Similarly, usability has not been widely explored in medical education apps [16]. Thus, there is a need to develop a better understanding of how the usability of mobile learning apps developed for health care education has been evaluated and conceptualized in previous studies.

Objectives
The aim of this scoping review has therefore been to identify usability methods and attributes in usability studies of mobile apps for health care education.

Framework
We have used the framework for scoping reviews developed by Arksey and O'Malley [17] and further developed by Levac et al [18] and Khalil et al [19]. We adopted the following 5 stages of this framework: (1) identifying the research question, (2) identifying relevant studies, (3) selecting studies, (4) charting the data, and (5) summarizing and reporting the results [17-19].
A detailed presentation of each step can be found in the published protocol for this scoping review [20]. We followed the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist for reporting scoping reviews (Multimedia Appendix 1 [21]).

Stage 1: Identifying the Research Question
The following two research questions have been formulated:

Stage 3: Selecting Studies
Two of the authors independently screened titles and abstracts using Rayyan web-based management software [22]. Studies deemed eligible by one of the authors were included for full-text screening and imported into the EndNote X9 (Clarivate) reference management system [23]. Eligibility for full-text screening was determined independently by two of the authors and disagreements were resolved by consensus-based discussions. Research articles with different designs were included, and there were no language restrictions. As mobile apps started appearing in 2008, this year was set as the starting point for the search. Eligibility criteria are presented in Table 1.

Stage 4: Charting the Data (Data Abstraction)
The extracted data included information about the study (eg, authors, year of publication, title, and country), population (eg, number of participants), concepts (usability methods, usability attributes, and usability phase), and context (educational setting). The final data extraction sheet can be found in Multimedia Appendix 3. One review author extracted the data from the included studies using Microsoft Excel software [21], which was checked by another researcher.
Descriptions of usability attributes have not been standardized, making categorization challenging. Therefore, a review author used deductive analysis to interpret the usability attributes reported in the included studies. This interpretation was based on a review of usability attributes as defined in previous literature. These definitions were assessed on the basis of the results of the included studies. This analysis was reviewed and discussed by another author. Disagreements were resolved through a consensus-based discussion.

Stage 5: Summarizing and Reporting the Results
Frequencies and percentages were used to present nominal data, together with tables and graphical illustrations. For instance, a figure showing the study selection process, an illustration of the frequency of inquiry-based usability evaluation and data collection methods, and an overview of the distribution of identified usability attributes were provided.

Eligible Studies
Database searches yielded 34,369 records, and 2796 records were identified using other methods. After removing duplicates, 28,702 records remained. A total of 626 reports were examined in full text. In all, 88 articles were included in the scoping review (Figure 1). A total of 8 articles comprised results from several studies in the same article, presented as study A, study B, or study C in Multimedia Appendix 3. Therefore, a total of 98 studies were reported in the 88 included articles. The included studies comprised a total sample population of 7790, with the number of participants ranging from 5 to 736 per study. Most of the studies included medical students (34/88, 39%) or nursing students (25/88, 28%). Other participants included students from the following disciplines: pharmacy (9/88, 10%), dentistry (5/88, 6%), physiotherapy (5/88, 6%), health sciences (3/88, 3%), and psychology (2/88, 2%). Further information is provided in Multimedia Appendix 3. The studies were published in 22 countries, most frequently the United States (22/88, 25%), Spain (9/88, 10%), the United Kingdom (8/88, 9%), Canada (7/88, 8%), and Brazil (7/88, 8%), with an increasing number of publications from 2014. Table 2 provides an overview and characteristics of the included articles.
A total of 19 studies used a psychometrically tested usability questionnaire, including the SUS, Technology Acceptance Model, Technology Satisfaction Questionnaire, and Technology Readiness Index. The SUS [112] was the most frequently used, appearing in 9% (9/98) of the studies.

Usability Attributes
A total of 17 usability attributes have been identified among the included studies. The most frequently identified attributes were satisfaction, usefulness, ease of use, learning performance, and learnability. The least frequent were errors, cognitive load, comprehensibility, memorability, and simplicity. Table 3 provides an overview of the usability attributes identified in the included studies.

Table 3. Distribution of usability attributes (n=17) and affiliated reports (N=88).

Principal Findings
This scoping review sought to identify the usability methods and attributes reported in usability studies of mobile apps for health care education. A total of 88 articles, reporting 98 studies, were included in this review. Our findings indicate a steady increase in publications from 2014, with studies being published in 22 different countries. Field testing was used more frequently than laboratory testing. Furthermore, the usability evaluation methods applied were either inquiry-based or based on user testing. Most of the inquiry-based methods were experiments that used questionnaires as a data collection method, and all of the studies with user testing methods applied think-aloud methods. Satisfaction, usefulness, ease of use, learning performance, and learnability were the most frequently identified usability attributes.

Usability Evaluation Methods
The studies included in this scoping review mainly applied inquiry-based methods, primarily the collection of self-reported data through questionnaires. This is congruent with the results of Weichbroth [10], in which controlled observations and surveys were the most frequently applied methods. Asking users to respond to a usability questionnaire may provide relevant and valuable information. However, among the 83 studies that used questionnaires in our review, only 19 (23%) used a psychometrically tested usability questionnaire; of these, the SUS questionnaire [112] was used most frequently. In line with the review on usability questionnaires [12], we recommend using a psychometrically tested usability questionnaire to support the advancement of usability science. As questionnaires address only certain usability attributes, mainly learnability, efficiency, and satisfaction [12], it would be helpful to also include other methods, such as interviews or mixed methods, and to add open-ended questions when using questionnaires.
Furthermore, the application of usability evaluation methods other than inquiry methods, such as user testing methods and inspection methods [10], could be beneficial and lead to more objective measures of app usability. Typically, subjective data are collected via self-reported questionnaires, whereas objective data are derived from measures such as task completion rates [40]. For example, in one of the included studies, the participants reported that the usability of the app was satisfactory by subjective measures, but the participants did not use the app [75]. Another study reported a lack of coherence between subjective and objective data; these results indicate the importance of not relying solely on subjective measures of usability [40]. Therefore, it is suggested that various usability evaluation methods, including subjective and objective usability measures, be used in future usability studies.
Our review found that most of the included studies in health care education (71/98, 72%) performed field testing, whereas previous literature suggests that usability experiments in other fields are more often conducted in a laboratory [1,113]. For instance, Kumar and Mohite [1] found that 73% of the studies included in their review of mobile learning apps used laboratory testing. Mobile apps in health care education have been developed to support students' learning, on-campus and during clinical placement, in various settings and on the move. Accordingly, it is especially important to test how the apps are perceived in specific environments [5]; hence, field testing is required. However, many usability issues can be discovered in a laboratory. Particularly in the early phases of app development, testing an app with several participants in a laboratory may make it more feasible to test and improve the app [8]. Usability testing in a laboratory can provide rapid feedback on usability issues, which can then be addressed before testing the app in a real-world environment. Therefore, it may be beneficial to conduct small-scale laboratory testing before field testing.

Usability Attributes
Previous systematic reviews of mobile apps in general identified satisfaction, efficiency, and effectiveness as the most common usability attributes [5,10]. In this review, efficiency and effectiveness were explored to a limited extent, whereas satisfaction, usefulness, and ease of use were the most frequently identified usability attributes. Our results coincide with those from a previous review on the usability of mobile learning apps [1], possibly because satisfaction, usefulness, and ease of use are usability attributes of particular importance when examining mobile learning apps.
Learning performance was assessed frequently in the included studies. To ensure that apps are valuable in a given learning context, it is relevant to test additional usability attributes such as cognitive load [9]. However, few studies included in our review examined cognitive load [68,80,108]. Mobile apps are often used in an environment with multiple distractions, which may contribute to an increased cognitive load [5] and, in turn, affect learning performance. Testing both learning performance and app users' cognitive load may improve the understanding of an app's usability.
We found that several of the included studies did not use terminology from usability literature to describe which usability attributes they were testing. For instance, studies that tested satisfaction often used words such as "likes and dislikes" and "recommend use to others" and did not specify that they tested the usability attribute satisfaction. Specifying which usability attributes are investigated will be important when performing a usability study of mobile apps, as this will influence transparency and enable comparison between different studies. In addition, evaluating a wider range of usability attributes may enable researchers to expand their perspective regarding the app's usability problems and ensure quicker improvement of the app. Defining and presenting different usability attributes in a reporting guideline can assist in deciding on and reporting relevant usability attributes. As such, a reporting guideline would be beneficial for researchers planning and conducting usability studies, a point that is also supported by the systematic review conducted by Kumar and Mohite [1].

Future Directions
Combining different usability evaluation methods that incorporate both subjective and objective usability measures can add various and important perspectives when developing apps. In future studies, it would be advantageous to use psychometrically tested usability questionnaires to support the advancement of usability science. In addition, developers of mobile apps should determine which usability attributes are relevant before conducting usability studies (eg, by registering a protocol). Incorporating these perspectives into the development of a reporting guideline would be beneficial to future usability studies.

Strengths and Limitations
First, the search strategy was designed in collaboration with a research librarian and peer reviewed by another research librarian and included 10 databases and other sources. This broad search strategy resulted in a high number of references, which may be associated with a lower level of precision. To ensure the retrieval of all potentially pertinent articles, two of the authors independently screened titles and abstracts; studies deemed eligible by one of the authors were included for full-text screening.
Second, the full-text evaluation was challenging because the term usability has multiple meanings that do not always relate to usability testing. For instance, the term was used when testing students' experience of a commercially developed app but not in connection with the app's further development. In addition, many studies did not explicitly state that a mobile app was being investigated, which also created a challenge when deciding whether they satisfied the eligibility criteria. Nevertheless, having 2 reviewers read the full-text articles independently and resolving disagreements through consensus-based discussions ensured the inclusion of relevant articles.

Conclusions
This scoping review was performed to provide an overview of the usability methods used and the attributes identified in usability studies of mobile apps in health care education. Experimental designs were commonly used to evaluate usability, and most studies used field testing. Questionnaires were frequently used for data collection, although few studies used psychometrically tested questionnaires. The usability attributes identified most often were satisfaction, usefulness, and ease of use. The results indicate that combining different usability evaluation methods, incorporating both subjective and objective usability measures, and specifying which usability attributes to test seem advantageous. The results can support the planning and conduct of future usability studies and the advancement of learning apps in health care education.