A Design Lenses-based process to evaluate interfaces for mobile devices

The present work proposes an evaluation process for mobile device interfaces. Interviews were performed to understand how evaluation processes fit into the routines of professionals who deal with mobile devices. The proposed process has a set of associated evaluation lenses. To facilitate access to the process, an application was developed, which is presented at the end of this paper. Tests were performed to assess the applicability of the proposed process. The results suggest that experts using the proposed process are able to identify problems and that the process itself is seen as useful in professional and educational settings. It is proposed that, in the future, the process be subjected to tests that include the evaluation and redesign of the evaluated interface; this would make it possible to observe how the conclusions and reflections awakened in the specialists throughout the evaluation impact the redesign of interfaces.


Introduction
Emerging countries are undergoing a digital transformation in which smartphones are becoming the main device for accessing the internet. Between 2013 and 2015, the percentage of adults who reported using a smartphone grew from 45% to 54%, with Malaysia, China, and Brazil contributing most to this increase [1]. In 2018, 98.1% of Brazilians reported using smartphones to access the Internet [2].
Mobile devices (tablets, smartphones, or wearables) run so-called applications, or apps, which are not merely small versions of desktop software: they can offer device-specific features, such as geolocation and a wide range of sharing methods. The specificities of these devices impose limitations on development, requiring apps to work quickly and easily without demanding much from the user. Small physical sizes, limited memory, and several methods of data entry are some of the important aspects to be considered during the mobile development process [3].
In this context, it is important to create applications whose interfaces are easy to learn, effective in use, and able to provide a pleasant user experience [4]. To verify that interfaces fulfill these objectives, evaluation processes are proposed, usually built around sets of heuristics that help the specialist identify problems. However, traditional evaluation methods do not consider the particularities of touchscreen-based mobile devices, which creates the need for new techniques for an accurate evaluation of interfaces in this paradigm [5].
Given this context, the current work proposes an interface evaluation process for mobile devices that considers the specificities of these devices. The paper is organized as follows: Section 2 discusses related work; Section 3 describes the methods used; Section 4 presents the results of the interviews and the phenomenological analysis of the collected data; Section 5 presents and discusses the proposed process, based on the results of the previous section; the paper finishes with conclusions about the results of this research and directions for further work.

Related Works
Regarding the applicability of heuristics in the context of mobile interfaces, studies describing heuristic sets specific to this context were reviewed; they are discussed in the following paragraphs.
A heuristic evaluation framework was proposed by [6] to analyze mobile learning applications, based on the collection and analysis of articles on this theme in order to identify the best practices described in the literature. To show feasibility, the framework was applied in practice, with evaluations of mobile learning applications being executed. As a result, the authors found that the framework is efficient in guiding specialists during the evaluation.
In [7], the authors sought to identify problems related to the usability of games on mobile devices and proposed a set of new heuristics for this context, with the purpose of improving the usability of mobile games. To develop the set, the authors collected reports from users about the experience of playing a car racing game; from these reports, categories were created and generalized in the form of heuristics. For validation, the tested game was redesigned based on the proposed heuristics and presented to the same sample as in the first stage of the research. It was concluded that redesigning the game according to the heuristics had a significant impact on the design.
In [8], the authors executed a systematic review to identify usability heuristics and metrics used in the literature and industry, and thus propose a set of usability heuristics aimed at mobile applications. The result is a set of thirteen heuristics that consider the tasks performed by the user, the context in which they are performed, and the cognitive load as important attributes of usability. One of the suggested future works is the evaluation of applications aimed at people with disabilities, in order to obtain data to extend the model to the range of social and cultural situations in which mobile applications can be inserted.
Another inherent criterion for evaluating these interfaces is the set of specific guidelines of each mobile operating system. These guidelines add yet another layer of knowledge required from professionals, as they are extensive documents covering the proper look and feel of each platform. Topics such as accessibility, user experience, and aesthetic standards are also relevant layers of knowledge in the planning process of these interfaces [9].

Methods
The research steps were as follows: Data collection; Analysis of the collected data; Synthesis of the findings; and Development of the proposal based on the knowledge acquired.
The data collection consisted of semi-structured interviews with professionals familiar with the development of applications for mobile devices, in order to understand how interface evaluation is handled in different contexts and by different professionals. The semi-structured format was chosen because it allows the insertion of relevant questions depending on the context, exploring the experience of the interviewed specialists in a specific way [10].
The analysis of the data collected in the interviews occurred following a phenomenological approach, in order to understand the phenomena reported in the experience of each professional [11].
The phenomenological analysis concludes with the synthesis of the findings, whose objective is to integrate all the insights into units of knowledge that are useful to the research and consistent with the reports of the phenomenon. This step guided the decisions taken in devising the proposal for the interface evaluation process with an emphasis on mobile devices.

Interviews
In total, ten interviews were carried out, respecting the phenomenological rule of conducting interviews until the reports begin to repeat, or until it is understood that there are enough reports to grasp the essence of the phenomenon [11]. A convenience sample was used to select the interviewees, ensuring that they were familiar with the research scope and could potentially provide relevant data to achieve the intended objectives. The group of interviewees was therefore made up of professionals who deal with mobile devices, including designers and developers at different levels, from beginner to specialist. This diversity seeks to represent the different degrees of contact that these professionals may have with development processes, including their practices and ingrained habits.
The interviews were conducted via a videoconferencing tool, and each lasted approximately 45 minutes. The data were recorded in text format, literally transcribing what the interviewee reported. The records were made without identifying the interviewee, removing names of people, companies, or institutions. Notion was used to record, store, and later analyze these data. In total, four developers, four designers, and two educators were interviewed.

Phenomenological analysis
With the conclusion of the interviews, the phenomenological analysis of the data began. This process aims to identify the essential nucleus of the phenomenon, for which it is first necessary to detach from previous beliefs in order to understand the phenomenon as reported [12].
The rigor of the data documentation is relevant to the process; for this reason, it was decided to transcribe the statements during the interviews themselves, so that unwanted interpretations would not occur during collection. This transcription process also accelerated the analysis stage, since the data were available for reading and review immediately after the interviews.
The phenomenological analysis followed the steps described in [13]. The first stage comprised an initial reading, with the objective of creating familiarity with and understanding of the interviewee's language, but with no intention of reaching conclusions. Subsequently, a new reading was performed with the objective of identifying units of meaning. This stage identified ninety-one units of meaning relevant to the understanding of the phenomenon, which were first highlighted and later organized in a table to facilitate data manipulation in the subsequent phases.
The third step consists of looking at the previously identified units in order to group them into categories. These categories group similar knowledge that together helps to understand specific aspects of the phenomenon. From the units of meaning highlighted in the previous step, eight categories were developed: Accessibility, Processes, Programmers, Designers, Educators, Quality, Investment, and Evaluation.
The Programmers, Designers, and Educators categories comprise reports inherent to work positions, featuring units of meaning that describe the role of these professionals in the evaluation processes. The Accessibility category comprises descriptions that help to understand how accessibility is treated during development processes. Processes is the category that comprises the rites performed by professionals in order to evaluate and improve the interface. Evaluation and Quality group the units of meaning whose reports address, respectively, evaluation and interface quality directly. The Investment category groups reports that describe how financial resources impact interface design and accessibility efforts.
The last step is focused on synthesizing the findings and producing descriptions that express the concepts understood about the investigated phenomenon. The reports were transcribed into the common language of the researchers, eliminating information not relevant to the focus of the investigation. Finally, it was possible to carry out the synthesis by grouping the common aspects among the reports. The main conclusions of the phenomenological analysis include:
• Most of the reported processes are flexible, without pre-defined rules or standardized documentation.
• Usability testing with real users is often reported as the central testing process. However, reports also say that these tests are postponed because they do not fit into work routines.
• Teachers seek to demonstrate the relevance of evaluating and criticizing the interface, regardless of which process is used.
• Complex and lengthy texts are ignored.
• Collaborative work between designers contributes to a critical view of the project.
• There is an expectation that the designer is the professional responsible for considering the accessibility aspects of the interface design.
• Developers also participate in discussions to criticize and improve the interface. Usually, these processes are based on previous opinions or experiences.
• Developers participate in the ideation stage in order to assess the technical feasibility of the solution.
• Products are constantly remade without quality analysis.
• Criticism without parameters can generate a feeling of dissatisfaction.
The reports also exposed the context that small companies hardly invest in accessibility. Programmers often participate in discussions to adjust functionality and consequently interfaces, but the focus of these professionals is to determine which technologies will be used and what is possible to develop within the stipulated period.

Proposed process
Based on these findings, the objective became to propose a process characterized by flexibility, applicability, and usefulness. The process needs to be flexible to adapt to the different realities and functions of professionals who deal with the development of applications for mobile devices, including programmers, designers, and educators.
Applicability refers to the ability to be applied by professionals of different levels of seniority, being useful for new professionals who are adapting to the routines and concepts involved in interface evaluation processes. It was also intended to generate useful material for consultation and use in teaching. Efforts were made to make the usefulness of the process easily understood, including guidance on documentation and on the tasks that need to be performed to achieve the objectives of the evaluation.

Heuristics
To focus the process on mobile devices and consider the specific characteristics of these devices, it was decided to use the set of heuristics proposed by [14]. Based on a systematic mapping focused on identifying and categorizing heuristics used in interface evaluation for mobile devices, a set of 65 heuristics divided into four categories was developed. The heuristics were divided into Accessibility, Aesthetics, Usability, and User Experience, in order to cover all the concepts related to interface design for these devices [14].
As also addressed by [14], it is possible to identify a pattern in the way heuristics are made available in the analyzed studies: formats such as lists, trees, or tables are commonly used. For extensive sets of heuristics, these structures do not facilitate understanding by beginners and tend to make the process more complex, since data lookup is not simplified [14].
Thus, in order to organize the heuristics proposed by [14] in a structure that facilitates the understanding and makes the process flexible, it was decided to use the Design Lenses structure. Design Lenses is a concept used in the design of user experience, but which first appeared within the scope of Game Design [15]. The lenses appear in order to guide the designer's eye. Each lens refers to a perspective, and there is a diverse range of lenses to address the breadth of interactive game design [16].
According to [16], lenses make it possible to perceive problems from several different perspectives. This concept is relevant in the context of evaluating interfaces for mobile devices, where we are faced with a complexity of interconnected concepts, due to the relevant number of heuristics and their division into categories.
In this way, 65 cards were diagrammed, forming a deck with four suits that represent the categories. Each card combines a heuristic with a set of questions, forming the lens [15]. Fig. 1 shows the anatomy of the diagrammed lenses: each category was given a color to facilitate visual distinction, and each card received an identifying code and an area relating it to cards with complementary concepts.
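The card anatomy described above can be modeled as a simple record. This is a minimal sketch: the field names, colors, and the sample card are illustrative assumptions, not the actual deck content.

```python
from dataclasses import dataclass, field

@dataclass
class LensCard:
    """One lens: a heuristic plus guiding questions, tagged with its
    category ("suit"), a distinguishing color, an identifying code,
    and the codes of cards with complementary concepts."""
    code: str                 # unique identifier, e.g. "US01" (hypothetical)
    category: str             # one of the four categories
    color: str                # visual distinction per category
    heuristic: str            # the heuristic the lens is built on
    questions: list[str]      # questions that guide the evaluator's eye
    related: list[str] = field(default_factory=list)  # complementary cards

# Illustrative color assignment (the real deck's colors may differ)
CATEGORY_COLORS = {
    "Accessibility": "green",
    "Aesthetics": "purple",
    "Usability": "blue",
    "User Experience": "orange",
}

# Illustrative card, not taken from the real set of 65
card = LensCard(
    code="US01",
    category="Usability",
    color=CATEGORY_COLORS["Usability"],
    heuristic="Visibility of system status",
    questions=["Does the interface keep the user informed about what is happening?"],
    related=["UX03"],
)
print(card.code, card.category)
```

The point of the structure is that each card is self-contained: an evaluator can pick one up (or open it in the app) and have the heuristic, its guiding questions, and pointers to related concepts in one place.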

Process
To visually represent the process and facilitate the understanding of its activities, steps, and documents, the Business Process Diagram (BPD) notation was chosen. BPD's goal is to make visual representations of processes viable, so that analysts, developers, entrepreneurs, and others involved in the development of technologies can understand them easily; for this, it uses a set of graphic elements and flowchart logic [17]. Fig. 2 presents the BPD of the proposed process and shows the three flows planned as proposals for using the lenses in an evaluation. Each flow was designed to demonstrate the versatility of the process, but there is no intention to limit the possibilities.
Regardless of the flow, it is always necessary to start the process by reviewing concepts related to the objective of the project, reviewing known information about the target audience, and defining the objectives of the evaluation. This decision was based on the conclusions of the phenomenological analysis, since the absence of formal rites in the evaluation processes performed by the professionals was identified as a standard. To underline the relevance of carrying out the evaluation, the elaboration of an action plan based on the information collected during the evaluation is set as the final activity.
The main flow is a complete evaluation, going through each lens of each category. This flow is aimed at longer, more time-consuming processes, which may occur sporadically. There are two alternative flows: the first is to use all the cards of a single category, an approach advisable when the professional already knows that the interface is deficient in one of the categories. In this way, the process helps the professional to gather arguments, structure the existing problems, and define an action plan.
The second alternative flow option is to randomly draw one of the lenses and evaluate only using it. It is a shorter flow, which will not provide a large amount of information, but can assist in the incorporation of a culture of evaluations.
The lenses alone can be used as a review and study material, and can be applied within the academic scope, to introduce the concepts to students in a playful way.
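The three evaluation flows amount to different ways of selecting cards from the deck before evaluating. As a sketch (the deck below is a placeholder with hypothetical card codes, not the real set of 65):

```python
import random

# Placeholder deck: 65 cards across the four categories (real codes differ).
PREFIX = {"Accessibility": "AC", "Aesthetics": "AE",
          "Usability": "US", "User Experience": "UX"}
DECK = [{"code": f"{PREFIX[cat]}{i:02d}", "category": cat}
        for cat in PREFIX
        for i in range(1, 18)][:65]

def full_evaluation(deck):
    """Main flow: go through every lens of every category."""
    return list(deck)

def category_flow(deck, category):
    """First alternative: all cards of a single category, used when a
    deficiency in that category is already suspected."""
    return [c for c in deck if c["category"] == category]

def random_draw(deck, rng=random):
    """Second alternative: draw one lens at random for a short evaluation
    that helps build a culture of regular assessments."""
    return [rng.choice(deck)]

print(len(full_evaluation(DECK)), len(category_flow(DECK, "Usability")))
```

Whichever selection is used, the process still opens with the review of project objectives and target audience, and closes with the action plan, as described above.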

Distribution
The phenomenological analysis also showed that professionals commonly report not knowing interface evaluation methodologies, while considering that such methodologies could be useful in their work routines. Reports also criticize how these processes commonly fail to reach professionals, whether due to language, lack of dissemination, or the absence of easily accessible content. Because of these reports, it was decided to take a first step towards making the results of this research easily available.
For this reason, a mobile application was developed, containing all the lenses, separated by category and accompanied by an explanation of how to use them. It is also possible to search by title, category, or information contained in the body of the card. Fig. 3 shows screens of the developed application, which is available for smartphones in the Apple ecosystem, on the App Store.
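The search described above can be approximated by a case-insensitive filter over the card fields. This is only a sketch; the field names and sample cards are assumptions about the app's data model, not its actual implementation.

```python
def search(cards, query):
    """Return the cards whose title, category, or body text
    contains the query, ignoring case."""
    q = query.lower()
    return [c for c in cards
            if q in c["title"].lower()
            or q in c["category"].lower()
            or q in c["body"].lower()]

# Hypothetical cards for illustration
cards = [
    {"title": "Feedback", "category": "Usability",
     "body": "Does the interface keep the user informed?"},
    {"title": "Contrast", "category": "Accessibility",
     "body": "Is text readable against its background?"},
]
print([c["title"] for c in search(cards, "usab")])  # matches by category
```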

The process tests
Tests were planned with the objective of verifying the applicability of the process in practical contexts. For this, it was decided to carry out the process and collect data on the participants' experience. In total, 20 tests were performed, using convenience sampling to select individuals who could represent the researched context within the group of students and mentors of the Apple Developer Academy - Mackenzie and Apple Developer Academy - IFCE projects.
The sample comprised designers, educators, and programmers familiar with mobile app development practices. The tests lasted 45 minutes and were carried out via videoconferencing.
Participants were invited via e-mail and informed about the contours of the tests and research. The informed consent was also sent in advance via e-mail so that the participant could read and decide on participation.
Participants were divided into sessions, which made it possible for the explanation of the tests to be given once per session, simultaneously to all participants. After the explanations, the participants proceeded to the evaluation. For this first test, it was decided to apply the version of the process in which a card is freely selected by the evaluator and the evaluation is performed only according to the heuristic presented on that card. For this reason, the time data are relevant for creating duration estimates for the other assessment flows.
The rites of each session were composed of:
• Signatures and acknowledgments: right after making sure that all participants were present, the researcher started the session with acknowledgments and introductions about the research work.
• Presentation of the process: the proposed process was presented, explaining the materials involved and the suggested flows for its application. This explanation was performed verbally, the BPD representation was not used.
• Presentation of the application: once the process and the concept of the heuristics presented in the form of cards were explained, participants were asked to download the developed application Evaluation Lenses. A brief explanation of how the application works was carried out, considering details of the anatomy of each card. The categories, number of cards and explanation screen were also presented.
• Presentation of the documentation sheet: an editable text file was made available to each participant so that documentation could be carried out during the evaluation of the application's interface.
• Card drawing: to execute the suggested flow for the evaluation, each participant freely selected a card to guide their evaluation.
• Definition of the application to be evaluated: each participant was asked to evaluate, on their mobile device, the application they use to manage their respective e-mails. With this, it was intended to standardize the familiarity of the participants with the analyzed application, forcing them to observe the interface that they frequently use through the perspective of the selected card.
• Evaluation: at this stage, the participants performed the evaluation individually. Each participant was asked to record the time spent assessing and documenting. It was suggested that participants finish the assessment when they felt comfortable doing so or felt they had explored the entire application interface.
• Application of questionnaires: when each participant reported finishing the evaluation, a digital questionnaire was sent; the instrument used can be found in the Appendices. The Airtable database tool was used to deliver the digital questionnaire, which consisted mostly of agreement-level questions plus two open questions.
For the questionnaire, statements were prepared to capture the participants' perceptions of the material presented, the process, and the heuristics in card format. For each statement, the participant indicated their level of agreement on a five-point Likert scale, from "strongly disagree" to "strongly agree". The statements were written so that higher levels of agreement correspond to characteristics that attest to the quality of the items evaluated. Thus, higher means represent higher levels of agreement, and vice versa.

Results
Process testing yielded some quantitative data. The average number of problems identified per evaluator using a single card was 3, and the average time each participant in the evaluator role spent going through the application in search of problems was 11 minutes and 21 seconds.
Extrapolating the average time used by the participants to evaluate the application with one card to the complete set of 65 cards gives approximately 12 hours and 16 minutes. The same extrapolation for the number of errors gives an average total of 195 problems or improvements identifiable per evaluator. These are only generalizations; confirming them would require specific experiments to verify how these values behave in practice. Even so, they are a relevant indicator of the time demanded by the process and of its results, quantified as the number of problems identified.
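The extrapolation follows directly from the reported averages. A sketch of the arithmetic, using the rounded per-card mean of 11 min 21 s (the published figure of 12 h 16 min presumably comes from the unrounded mean, hence the small difference):

```python
CARDS = 65
PER_CARD_SECONDS = 11 * 60 + 21   # reported mean time per card: 11 min 21 s
PROBLEMS_PER_CARD = 3             # reported mean problems per card

total_seconds = PER_CARD_SECONDS * CARDS
hours, rem = divmod(total_seconds, 3600)
minutes = rem // 60
total_problems = PROBLEMS_PER_CARD * CARDS

# ≈ 12 h 17 min with the rounded mean, close to the reported 12 h 16 min
print(f"full deck: ~{hours} h {minutes} min, ~{total_problems} problems")
```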
The phenomenological analysis of the answers to the two subjective questions contained in the questionnaire resulted in the identification of 45 units of meaning. These units were classified into three categories according to the content of the report: strengths, weaknesses, and neutrals. Table 1 summarizes the answers to the agreement statements.

Table 1. Levels of agreement per statement.
Statement | SA | A | N | D | SD
The arrangement of the information in the card helped the understanding of the concepts | 55% | 30% | 5% | 10% | 0%
Categories have clear labels related to card set concepts | 85% | 15% | 0% | 0% | 0%
The process helped to identify problems contained in the interface | 90% | 5% | 0% | 5% | 0%
I can observe usefulness in the process | 95% | 5% | 0% | 0% | 0%
I can see relevance in the process of documenting the problems identified | 75% | 25% | 0% | 0% | 0%
The process made me see positive points in the analyzed interface | 55% | 30% | 10% | 5% | 0%
I found it easy to identify problems or improvements using the selected card | 20% | 50% | 15% | 15% | 0%
The card used brought a previously unknown perspective | 25% | 25% | 10% | 30% | 10%
The content present in the card used covered topics relevant to the evaluation of the interface | 90% | 5% | 5% | 0% | 0%
Caption: SA - Strongly Agree; A - Agree; N - Neither agree nor disagree; D - Disagree; SD - Strongly Disagree.

The neutral units of meaning comprised some suggestions. These reported how documentation could be improved with the help of whiteboard tools, especially when the evaluation needs to take place together with other evaluators. Reports also considered it important to map which cards achieve better results, in order to understand which heuristics allow the evaluator to identify more points for improvement in the interface. Participants also noted that this process is more complex and covers more concepts than other heuristic lists.
One participant reported feeling that they did not have sufficient knowledge about the specific heuristic they had selected at random, and concluded that they did not evaluate well due to this lack of prior knowledge.
The reports considered weaknesses were primarily related to how extensive the process can become, and to the fact that performing it individually results in the absence of relevant discussions, while carrying it out with other professionals can make it complex. There are also reports suggesting that the process is tedious, tiring, and laborious to perform individually, and that completing the assessment leaves a sense of absence of objective answers. About the questions present in the cards, there were two reports: one pointed out that they could be highlighted as topics, and another participant suggested that there should be more questions, since the questions help in understanding the concept presented in the card.
The reports classified as strengths are diverse. A recurring report suggests that the process helps to build critical thinking in relation to the interface, which results in good applications and interfaces. Other units of meaning expressed that this validation step is indispensable and has the potential to be used continuously by teams that apply agile methodologies, both daily and in refinements. Adjectives such as simple, practical, fast, and objective were used to describe the process itself, corroborating the applicability objectives of the proposed process.
It was suggested that the process is useful for professionals working with interface and experience design, while developers reported that it can be an auxiliary tool in understanding how to code tasks. At the same time, the process was described as an enabler of reflection and of deeper learning about the concepts presented, being considered useful in classes and in the academic environment.
Participants reported that the process changed the way they observed the application, and that even in a consolidated application it is possible to find problems and points for improvement when performing the evaluation. It was also reported that the cards were able to isolate concepts that are usually difficult to separate from a larger context, making it possible to focus on specific aspects one at a time.
With the general considerations documented at the end of the process, it was possible to observe that in addition to identifying points of improvement and problems, the proposed process implied reflections on the positive points contained in the interface. Even the participant who did not find any problems reported several reflections on relevant points of the interface in his final remarks.
From these considerations, it was possible to perceive that participants were faced with questions on the cards that asked about the target audience and the objectives of the product itself. As the participants are not part of the development of the evaluated application, such questions are difficult to answer, making the evaluation at these moments more complex and distant from reality.
The objective section of the questionnaire was based on an agreement scale, with four statements related to the materials presented (application and documentation sheet), four to analyze the process itself, and three to understand aspects related specifically to the heuristics arranged in the form of cards. Average scores were computed from the values assigned to each level of agreement.
The statement "the card used brought a previously unknown perspective", referring to the heuristics in card format, has the lowest average of the analysis. This may indicate that most participants were already familiar with the concepts contained in the card used in the evaluation. On the one hand, this result may suggest that the sample has mastery of the topics covered in the process; however, this cannot be asserted, since the phenomenon may be a result of the random choice of cards. In this case, 30% of the participants disagreed with the statement, while 25% strongly agreed.
A similar scenario occurs with the statement "I found it easy to identify problems or improvements using the selected card", which has one of the lowest levels of agreement in the set: 15% of the participants disagreed with it. In isolation, this statement says little, but it can be read alongside others. Considering the statement "the process helped to identify problems contained in the interface", which reaches an average agreement of 1.8 and with which 90% of the participants strongly agree, it can be concluded that, although identifying problems is not easy, the process as a whole can help professionals. The complexity of identifying improvements and problems may be inherent to the assessment itself.
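The quoted 1.8 average can be reproduced from the response percentages if the five Likert levels are coded from +2 (strongly agree) down to -2 (strongly disagree). This coding is an assumption inferred from the figures, not stated in the text:

```python
# Assumed coding of the five-point scale (not stated in the paper)
WEIGHTS = {"SA": 2, "A": 1, "N": 0, "D": -1, "SD": -2}

def likert_mean(shares):
    """Weighted mean of agreement levels, given response shares in 0..1."""
    return sum(WEIGHTS[level] * share for level, share in shares.items())

# "The process helped to identify problems contained in the interface"
responses = {"SA": 0.90, "A": 0.05, "N": 0.00, "D": 0.05, "SD": 0.00}
print(round(likert_mean(responses), 2))  # 1.8, matching the reported average
```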
Regarding the content and structure of the cards, the statements "the information contained in the cards was easy to understand" and "the arrangement of the information in the card helped the understanding of the concepts" indicate that there is acceptance in relation to the format used to understand information about heuristics. However, 10% of the participants did not agree with the statement that relates to the arrangement of information in the card. This shows that there is still room for adjustments in the structure of the cards and for analysis of this item in particular.
The statement "I can observe usefulness in the process" stands out for having the highest average of all. For this statement, participants answered only agree or strongly agree, at 5% and 95% respectively. It can thus be concluded that, although some reports suggest adjustments and comment on the complexity of the process, its usefulness is clearly perceived.

Conclusions and Future work
The results suggest that experts using the proposed process are able to identify problems; that the process itself is seen as useful in professional and educational settings; and the structure used to present the heuristics is valid and helps to understand the concepts to be evaluated. However, it is understood that more tests are necessary, in order to observe how the process behaves in the other proposed assessment flows, and in analyses carried out with several professionals simultaneously. Observing the average time obtained for analysis in the tests performed, it is possible to identify that the process tends to be extensive, which impacts its applicability in agile contexts.
It is also necessary to specifically evaluate the performance of each heuristic, both in terms of its ability to make the concept more easily understandable and in terms of its ability to identify problems. There is also a need to develop more questions for each of the cards, as suggested by the test takers. There is even the possibility of using visual resources to support the exemplification of the concept.
Printed cards have physical limitations, such as limited space for information. The developed application thus becomes a strong ally, which can evolve to gather more information about each concept. It is possible to plan sections of the application that gather the articles analyzed in the systematic mapping, associate images and reference links with each card to help the specialist, and add augmented reality features that can work together with the physical cards.
It is proposed that, in the future, the process be subjected to tests that include the evaluation and redesign of the evaluated interface. This would make it possible to observe how the conclusions and reflections awakened in the specialists throughout the evaluation process impact the redesign of interfaces, and would allow comparison between the evaluated and the redesigned interface, helping to understand the improvements brought by applying the proposed process. Likewise, experiments to refine the heuristics and the card content are also considered future work.