1 Introduction

Usability has become one of the main disciplines in the design and development of digital multimodal artifacts. Nevertheless, over the last ten years new interaction paradigms have emerged and become the status quo, representing a great challenge for both design practitioners and researchers in this matter.

Following on this, several studies [1–3] were conducted to verify whether traditional usability evaluation is able to produce satisfactory results in this new context of multimodal digital interactions, which bring to the table, among others, touch and speech interactions.

In [1], a framework based on the principles of experience (subjectivity, temporality and situatedness) was built to elicit real-time, real-context and direct user feedback. Through an experiment with casual gamers, it was possible to validate hypotheses on how, when and where user experience should be assessed.

It became clear that users should be elicited directly, in real time and in the real context of use. To do that, measurement scales were presented remotely during the actual use of a casual web game. The results established new parameters for those who work with the evaluation of multimodal digital artifacts of any kind.

Reference [2] adds a critical analysis of traditional usability heuristics. The work compares them and compiles a new set specifically for digital multimodal artifacts; the experiments were conducted in the context of medical devices with digital composition.

In this study, the collection of heuristics built as a result of the aforementioned analysis achieved better results than Nielsen’s [4], considering both the number of violations identified and their severity ratings. Both inexperienced and specialist evaluators were able to conduct the evaluation.

Last but not least, [3] offers a different perspective on the same issue. It surveys newer paradigms being employed in industry to see which best fit the usability testing of contemporary artifacts. Positive results, compared to traditional usability, were achieved in two experiments with individuals using mobile devices and digital platforms.

Both procedures suggest that the group working with the collection of multimodal heuristics built during the research achieved better performance in the evaluations than the group using traditional ones. They were able to identify a considerably larger number of problems, with higher severity ratings, and pointed out more enhancement opportunities.

This paper aims to present, analyze and correlate the results of these three studies, pointing out common issues, relevant advances, and opportunities for future research and practice. The aim resides mostly in the need to update, academically and pragmatically, the usability testing discipline and heuristic evaluation frameworks.

It is believed that combining the how, when and where principles with the collection of traditional heuristics remodeled for digital multimodal artifacts and the compendium of contemporary ones, based on industry guidelines, can lead to more accurate evaluations.

2 Principles of Experience in the Evaluation of Digital Artifacts

In the design process, evaluation is a phase of major impact in the pursuit of an effective and efficient product. It is well established that a product cannot be accurately evaluated without considering users’ perceptions. But capturing those perceptions is difficult; many concerns have to be addressed in order to make a valid user assessment.

When dealing with people, there is consensus on the importance of approaching them correctly, without manipulating their feedback (consciously or not), in order to extract accurate data on their experience; presenting potential responses makes it possible to assess their state of mind even in very quick interactions.

However, there is much more to data collection. A whole branch of scientific thinking, named knowledge elicitation, deals with the subtleties of extracting reliable data from users: the reliability of the data collected, the questions to be asked, the way to ask them and the resources used throughout the assessment.

Much remains to be learned about the best way to assess subjective data, such as personal experiences, while maintaining its validity. This is the issue the research aims to address. Evidence [1] was obtained that the principles of experience can help achieve that, no matter the assessment model, method, technique or tool being employed.

Essentially, user experience rests on three main principles [5–8]:

  • Every experience is subjective. Observers (researchers) can only interpret it through the lens of their own backgrounds; the best way of assessing it, then, is through the user himself;

  • Every experience is temporal. Once it is over, it cannot be replicated; each occurrence differs from the others, so it is unwise to rely on users’ memory;

  • Every experience is situated. In other words, it is not possible to reproduce it outside the natural context of use (in a lab, for example).

A framework called TR2UE (tracking real-time, real-context user experience) was then designed to incorporate these principles into the elicitation process.

To test it, users were challenged to play a game; during the experience, whenever they became inactive, a pop-up appeared with the question “How do you feel?” and either a 5-point Likert or a pictorial scale (A/B testing [9]), showing a sequence of numbers or drawings representing potential subjective states. Users gave feedback by choosing one of them (Figs. 1 and 2).

Fig. 1. Screenshots of the web game running against time.

Fig. 2. Courtesy of Joy Street and Jynx playware.
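As an illustration of the elicitation mechanism, the sketch below shows how inactivity-triggered sampling of this kind could be wired into a web game. It is a minimal, hypothetical sketch, not the implementation used in [1]: the idle threshold, the monitored events and the showPrompt helper are all assumptions introduced here.

```typescript
// Minimal sketch of inactivity-triggered experience sampling (TR2UE-style).
// Hypothetical: idle threshold, event list and showPrompt rendering are
// assumptions, not the implementation used in [1].

type ScaleVariant = "likert" | "pictorial"; // the two A/B-tested scales

interface FeedbackSample {
  variant: ScaleVariant;
  value: number;     // 1..5 on the 5-point scale
  timestamp: number; // when the answer was given (real time)
  gameState: string; // where in the game it happened (real context)
}

const IDLE_MS = 5000; // assumed inactivity threshold
let idleTimer: number | undefined;
const samples: FeedbackSample[] = [];

// Any player action counts as activity and postpones the prompt.
function resetIdleTimer(currentGameState: string): void {
  window.clearTimeout(idleTimer);
  idleTimer = window.setTimeout(() => promptUser(currentGameState), IDLE_MS);
}

function promptUser(gameState: string): void {
  // Randomly assign the scale variant (A/B testing).
  const variant: ScaleVariant = Math.random() < 0.5 ? "likert" : "pictorial";
  // showPrompt stands in for the pop-up asking "How do you feel?".
  showPrompt("How do you feel?", variant, (value: number) => {
    samples.push({ variant, value, timestamp: Date.now(), gameState });
  });
}

["click", "keydown", "touchstart"].forEach((evt) =>
  window.addEventListener(evt, () => resetIdleTimer("playing"))
);

// Placeholder for the pop-up UI; its rendering is out of scope here.
declare function showPrompt(
  question: string,
  variant: ScaleVariant,
  onAnswer: (value: number) => void
): void;
```

The key design choice is that the prompt only fires on inactivity, so the assessment happens at natural pauses rather than interrupting play.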

A total of 213 users participated in the experiment. Their opinions were assessed directly, in real time and in the natural context of use, while they were playing. As the results pointed out, this interfered less with the experience and generated very rich, unmanipulated feedback compared with previous lab and specialist assessments.

The literature already recognizes the subjectivity factor in eliciting users, but “situatedness and temporality as two other aspect of the user experience are mostly neglected” [10]. It became clear that any theoretical or methodological construction that seeks to assess users should consider the principles of experience in order to enhance the quality of data collection.

3 Traditional Heuristics Compilation for Medical Devices with Digital Composition

As mentioned above, [2] presents evidence that a compiled set of heuristics can generate better results in finding, and of course helping to solve, issues related to the use of medical devices with digital composition (and other digital multimodal artifacts). The failure of this kind of equipment can lead to the loss of human life; because of that, such devices increasingly incorporate state-of-the-art digital components.

According to the International Electrotechnical Commission (IEC), a medical device can be described as any instrument, machine, application, software, calibrator or similar item, fabricated to be used alone or in combination, for one or more of the following medical purposes, among others:

  • Diagnosis, prevention, monitoring, treatment or relief of disease;

  • Diagnosis, prevention, monitoring, treatment, relief of or compensation for an injury;

  • Investigation, replacement, modification or support of the anatomy or of physiological processes;

  • Provision of information by means of the examination of specimens derived from the human body, for medical purposes;

  • Support and maintenance of life;

  • Birth control.

A great part of these devices contain complex digital components, and they have become part of daily activities throughout the world. They can be found in Intensive Care Units (ICUs), helping to take care of patients 24/7, for example, but also at home, administering medicine in the right amounts and providing statistics in cases of severe disease.

Considering that the use of complex digital components is a reality when it comes to the welfare of patients, the possibilities of use are almost infinite; but, of course, the demand for specialization and work on the part of those who operate them also increases considerably [11]. In general, the number of new possibilities and the number of errors grow proportionally [12].

It is already common knowledge that one of the main causes of problems during medical procedures is human failure. According to Leape [13], 69% of health injuries happen for that reason. Like [14], we believe that poor design can make things even worse for operators and more prone to accidents.

That kind of human-computer interaction, and the cognitive distress associated with the use of medical devices with digital composition, can be evaluated through heuristic usability evaluation. We therefore wanted to know whether a combination of contemporary heuristics would generate better results in comparison with Nielsen’s.

To that end, the Principles of Interactive Design [15], the Eight Golden Rules of Interface Design [16], the Design Principles [17] and Nielsen’s heuristics were analyzed and, as a result, a set of 10 heuristics was compiled to better evaluate medical devices with digital composition. The resulting heuristics are listed below.

  • Software-user system-software;

  • Learning ability;

  • Facilitating the cognition;

  • User control and system flexibility;

  • The system and the real world;

  • Graphic design;

  • Navigation and output;

  • Consistency and standards;

  • Error Management;

  • Help and Documentation.

A heuristic evaluation was conducted, and the results suggest that the combined heuristics were much more effective in identifying problems, pointing out 64 issues to be addressed with an average severity rating of 1.75, compared to Nielsen’s, which enabled evaluators to identify 36 problems with an average severity rating of 1.47.
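As a worked illustration of how such figures can be derived, the sketch below tallies the issues found with each heuristic set and averages their severity scores. It assumes, hypothetically, that the reported rating is the mean severity across all identified issues on a 0–4 scale such as Nielsen’s; the study itself does not spell out the formula, and the sample data is illustrative only.

```typescript
// Hypothetical aggregation of heuristic-evaluation results: the reported
// rating is assumed to be the mean severity across identified issues.

interface Issue {
  heuristic: string; // which heuristic was violated
  severity: number;  // e.g. 0 (not a problem) .. 4 (usability catastrophe)
}

function summarize(issues: Issue[]): { count: number; meanSeverity: number } {
  const total = issues.reduce((sum, issue) => sum + issue.severity, 0);
  return {
    count: issues.length,
    meanSeverity: issues.length ? total / issues.length : 0,
  };
}

// Illustrative issue list; 64 issues averaging 1.75 vs. 36 averaging 1.47
// would be summarized the same way.
const combinedSet: Issue[] = [
  { heuristic: "Error Management", severity: 3 },
  { heuristic: "Facilitating the cognition", severity: 1 },
];
console.log(summarize(combinedSet)); // { count: 2, meanSeverity: 2 }
```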

In conclusion, the combined heuristics represent a starting point for digital artifact evaluation. Besides presenting better results, the combination is flexible and can be refined indefinitely by specialists. The next step must be to evaluate different kinds of artifacts, in different contexts of use.

4 Industry Design and Development Guidelines: A New Set of Heuristics for Digital Multimodal Artifacts

Nielsen’s heuristics have gained prominence over the past decades and have become one of the main tools for the evaluation of digital artifacts. They were formulated in the early 1990s as a result of his research on websites and systems for desktop computers, and, as we know, a lot has changed since then.

The main issue is that the desktop computer interaction paradigm, known as WIMP (Windows, Icons, Menus and Pointers), has very different characteristics when compared with newly emerging computing paradigms, such as multimodal interactions based on multi-touch and speech, made popular by mobile devices and, more specifically, tablets.

In that sense, multimodal interactions are a major challenge for traditional usability methods, which may not be able to take their differences and particularities into account during evaluation. This problem becomes highly relevant as such emerging paradigms increasingly become part of our daily activities, and usability practice can prove fragile in that context.

Even though there are efforts to apply the traditional usability literature to emerging devices, this does not mean that its conventions are appropriate to them. It is more likely that designers apply ad hoc methods or sets of empirical techniques based on human factors, which often prove inefficient and outdated [18].

Our research [3] aimed to verify whether a traditional usability inspection method shows adequate results when faced with digital multimodal artifacts. We also wished to contribute to a new usability by compiling design and development guidelines established in industry, to be used as an alternative in the evaluation of this kind of device, as listed below.

  • Visibility and feedback;

  • Compatibility;

  • Control and freedom;

  • Consistency;

  • Error prevention;

  • Minimum actions;

  • Flexibility of use;

  • Organized content;

  • Error management;

  • Direct manipulation;

  • Change orientation;

  • Human reach.

Two experiments were conducted, with two separate groups. The first was made up of professional designers with greater experience and familiarity with usability concepts. The second consisted of computer science professionals with shallower knowledge of usability and little or no experience in evaluations.

The procedure consisted of presenting the different heuristic sets to the respective groups, followed by the presentation of usage scenarios. Thereafter, each expert performed the evaluation individually, classifying each problem according to the heuristic violated and then attributing a degree of severity based on the frequency, duration and impact of the problem.
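The exact weighting of these three factors is not specified in the study; a plausible reading, sketched below under the assumptions of equal weights and 0–4 ratings for each factor, combines them into a single severity degree.

```typescript
// Hypothetical severity scoring: frequency, duration and impact are each
// rated 0..4 and averaged with equal weights. The actual weighting used
// in [3] is not specified.

interface SeverityFactors {
  frequency: number; // how often the problem occurs (0..4)
  duration: number;  // how long it persists or recurs (0..4)
  impact: number;    // how badly it hinders the task (0..4)
}

function severityDegree({ frequency, duration, impact }: SeverityFactors): number {
  return (frequency + duration + impact) / 3;
}

// Example: a frequent, persistent, moderately harmful problem.
console.log(severityDegree({ frequency: 4, duration: 3, impact: 2 })); // 3
```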

Both experiments suggest that the groups working with the compilation of multimodal heuristics performed better in the heuristic evaluation. The use of heuristics better suited to the multimodal interaction paradigm yielded better performance both in the number of problems pointed out and in the degrees of severity assigned.

In the first experiment, the evaluations using the multimodal heuristics identified 92 issues with an average severity rating of 2.55, compared with 45 issues and a severity rating of 1.53 in the Nielsen-based ones. In the second experiment, with less familiarized evaluators, 39 issues with a severity rating of 2.44 were pointed out, compared with 26 issues and a severity rating of 2.29.

5 Conclusions

As presented throughout this paper, eliciting user information has become a major concern in the evaluation of digital multimodal artifacts. It is not difficult to understand why: established evaluation paradigms, and mainly usability testing, have had to acknowledge that a lot has changed in human-computer interaction and that new interaction paradigms are now established.

Three studies explored how it might be possible to deal with contemporary interactions without losing the advances achieved in the past [1–3]. The major correlation among them lies in doing this in a way that makes sense not only for designers, but also for researchers and, of course, more importantly, users.

Reference [1] highlights the importance of considering subjectivity, temporality and situatedness, the three principles of user experience, while eliciting knowledge. The results showed that the less the actual use of a multimodal digital artifact is disturbed, the less users will rationalize their assessment and, as a result, the more accurate the collected data will be.

In parallel, [2] presents a compilation of traditional heuristics, drawn from those widely used in industry and from validated results. The effort is to clarify which rules are, and which are not, adequate in the context of contemporary multimodal interactions, without losing the advances registered so far.

Furthermore, industry is always moving forward, bringing new techniques, methods, methodologies and models to the table. Reference [3] has put those updated paradigms together, fully aligned with the state of the art, creating a new set of heuristics based on the published documentation of leading organizations and recognized researchers.

Ultimately, the opportunity for advancement resides mostly in combining these three constructs into a framework for assessing user feedback. That will enhance the quality of data collection, maintain the rigor of traditional usability testing and add another layer of contemporary validation. Results on this three-step usability testing will be shared as soon as the new experiments come to an end.