Designing human-centric software artifacts with future users: a case study

The quality and quantity of participation supplied by human beings during the different phases of the design and development of a software artifact are central to studies in human-centered computing. In this paper, we investigate what kind of experienced people should be engaged to design a new computational artifact when a participatory approach is adopted. We compared two approaches: the former including only future users (i.e., novices) in the design process, and the latter enlarging the community to expert users. We experimented with the design of a large software artifact, in use at the University of Bologna, engaging almost 1500 users. Statistical methodologies were employed to validate our findings. Our analysis has provided mounting evidence that expert users contributed to the design of the artifact only by a small amount. Instead, most of the innovative initiatives came from future users, thus overcoming some traditional limitations that tend to exclude future users from this kind of process. We here challenge the traditional opinion that expert users typically provide a more reliable contribution in a participatory software design process, demonstrating instead that future users would often be better suited. Along this line, this is the first paper in the field of human-centric computing that discusses the relevance of offering future users a larger design space, intended as a higher level of freedom given in a software design situation demarcated by precise design constraints. In this sense, the outcome has been positive.

In particular, in the last decades, the role of users has become increasingly relevant in participative software design processes, with the use of specific methodologies such as (i) User-Centered Design (UCD) [4][5][6] and (ii) co-design [7][8][9]. These methods differ in how users are involved in the design process. The main difference is that the UCD methodology takes users' needs into consideration in the form of requirements to be passed down to designers and developers [10,11], while, at the other end of the spectrum, in a co-design process users play an active role in all the software project phases, even the initial ones, by proposing ideas and providing suggestions in different forms [2,12].
Although these participatory methodologies are widely investigated in the literature, as well as intensively used in several realistic contexts [13,14], a key issue remains: how to identify those groups that best represent the communities of users who can provide the best contribution during the design process of a computational artifact [15].
Drawing on these considerations, in this paper we go a step further along this line of research, trying to find an answer to the crucial question posed by Whittle [16]: what kind of users' participation is needed for a successful participative software design process?
In our work, we push forward this kind of investigation by challenging the idea of engaging future users (i.e., the novices) as the users that could take part in the design process of a computational artifact.
In particular, we carried out a special experiment where, beyond future users, the team of collaborators was also extended to include a further group of expert users (meaning those users who had already experienced similar kinds of software applications). The motivation behind this choice rests upon the intuition that engaging expert users could improve the final software product, by virtue of the knowledge that they have already acquired in similar experiences.
The aim of this experiment was that of contrasting the two kinds of contributions coming from future and expert users, respectively; each group of human beings having different expertise in the field of interest.
To anticipate here some of the results, we surprisingly maintain that the intuition mentioned above is questionable, under different viewpoints.
To cite only the most relevant one, our analysis reveals that the increment in innovation supplied by experts is moderate when contrasted with the novel intuitions proposed by future users. We here essentially challenge the traditional sentiment that only expert users are functional to productive participatory software design processes, demonstrating, instead, that the lack of bias in future users could be better suited to generating effective innovation. Thus, in the specific field of human-centric computing, we maintain that this is the first paper that discusses the importance of offering future users a larger design space, intended as a higher level of freedom given in a software design situation [17][18][19].
Specifically, our analysis was based on a case study of a mobile software artifact intended for high school students who have to choose a University degree program. As already said, two different groups of users were involved in the design process of the software application. The first group, of future users, comprised 200 high school students. The second group, of expert users, comprised 59 students already attending courses at the University. Finally, a committee of 20 different evaluators, including faculty and staff members, assessed the contributions provided by both future and expert users. After this assessment, the software application was developed based on the variants approved by the committee, and finally tested by circa 1500 students from high schools (who had not taken part in the first stage of this process), with the (not) surprising result mentioned above.
The remainder of this paper is organized as follows. In the next section, we clarify the general context at the basis of our study. In "Related work" section, we extend our discussion on other works related to our research, while in "How the design process evolved" section, we summarize the process to which: (i) future users, (ii) expert users, and, finally, (iii) the committee were subjected. In "Results and discussion" section, measurements and statistics are provided that emphasize the results of our study. Finally, "Conclusions and future work" section concludes the paper with a brief discussion.

Research issues and context
We begin this Section by first introducing the concept/definition of who expert/future users are. Then, we will clarify what a function is, within the terms of this present paper. Finally, we will present our research question, along with the scientific process through which we have tried to give an answer to the question mentioned above.
Let us begin with the issue of expert/future users, as emerging from the specialized literature [20][21][22]. In the general context of human-computer interaction, there are three main types of users: 1. Experts (i.e., frequent users); 2. Novices (i.e., future users); and, finally, 3. Intermittent users. Apart from intermittent users, whose role is often rather marginal, the term experts, applied to users of a given software artifact, means that these users have had past experiences with those artifacts (or similar ones), and thus require only brief feedback on a new software of the same type while interacting with it. In essence, the value brought by experts amounts to the experience they have gained in the past. On the contrary, novices are those users whose past experience (if any) with the same kind of software artifacts has been limited in time and in intensity, this being the cause of their slow progress in understanding the peculiarities of that software artifact.
To come to our research, expert users can be identified as those University students who have already interacted with the didactic/research functions offered by that University. Novices (i.e., future users), instead, are those freshmen who still have to begin their studies at the University.
Needless to say, in contexts different from the one we are investigating, the identification of experts vs novices may require further reflection. We now explain what the term function means within this paper. In essence, with the term function we refer to the collection of all the various information that students search for in order to make a decision, before they choose a specific degree to attend at the University. To simplify, the issue here has been that of deciding what information to make available, and the corresponding software tools to be used to facilitate this task. More precisely, the paper discusses the relevance of the different types of suggestions that different users have provided, based on their experience. Along this line, when we talk about functions, we are not talking about engineering/software requirements, but rather about those specific types of information that students need to make a choice. In this sense, we could perhaps talk about users' requirements. Categories, finally, are simply aggregations of groups of functions.
In this research, we focused our attention on the following research question: • Are either experts or novices or both well qualified to provide a contribution to the design of a human-centric software artifact?
To answer this question, we focused on the problem of how a software application should be designed for the use of high school students who need to choose a University degree. We architected a process consisting of the following three phases (also depicted in Fig. 1):
• Phase 1. More than 200 high school students (novices) were asked to propose functions to be incorporated into the software application to be developed.
• Phase 2. The same question was asked to 59 students already attending courses at the University of Bologna (experts).
• Phase 3. A committee of 20 faculty and staff members was asked to check all the functions proposed in the previous phases (1 and 2), and to select the final ones to be included in the final software artifact.
This software application was intended to be of help for high school students, while choosing the University program that best suits their skills and interests. This emerges as a significant problem in a huge University, like the University of Bologna, for example, with 11 Schools, 33 Departments, 219 different degrees and more than 85,000 students [23].
Fig. 1 The design process
As previously mentioned, our first decision was that of choosing future users as the main collaborators in the process of designing the application of interest. In particular, last-year high school students were engaged, who were going to decide about their academic future in a few months.
Then, doubting whether such a team of collaborators included subjects who were too young to express reliable opinions [24], we enlarged the audience to include also expert users, that is, students already attending courses at the University.

Related work
Deciding who can take part in a participatory design process is an open issue, especially if the subject of the design is a human-centric software artifact. We can start a succinct review of this crucial subject by recalling that general concerns are expressed, in the specialized literature, regarding the limitations emerging when users play an active role in the design of a software artifact, from a variety of viewpoints, including conceptual, ethical, and pragmatic points of view [25].
Among those concerns, quite convincing is the fact that predetermining who participates has the negative effect of limiting the potential for the design, as discussed at length, for example, in [26,27]. Also of great pragmatic relevance is the concern deriving from the type of collaborators who can be engaged in this kind of process, from a social/ethical viewpoint. For example, [28] propose to engage socially disadvantaged citizens in participatory design activities, as a means to "uncover hyper-local concerns" that only such special collaborators are able to recognize, even if special care is often required to manage such a type of collaborators. The works by [29][30][31] are to be interpreted along this same line.
Besides these considerations, more relevant to our study is the issue about the tension that is created between (future) users and a traditional team of developers and analysts of software artifacts. This tension requires specific attention. A wealth of research, in fact, has demonstrated that analysts and developers, typically, perceive themselves as the real domain experts. As a consequence, they tend to unconsciously devalue novices' contributions, thus failing to internalize new experiences into human-centered software artifacts [32][33][34].
This kind of problem is further exacerbated if users are young people, as in our case study, since differences in age, culture and lifestyles are seldom considered as valuable resources from the standpoint of a traditional expert. Nonetheless, specific attention is to be devoted to these cases, as engaging youngsters in participatory activities to design a computational artifact is recognized as extremely challenging [35].
To this aim, several scientists have already reported on the urgent need for further research to better identify appropriate methods that involve young users during the development of a computational artifact [36]. The main issue is that youngsters (or even adolescents) occupy a vulnerable state, poised midway between childhood and adulthood. They hold this tenuous, hybrid state both under law and custom. As such, they are understudied, poorly understood, and weakly represented by the interaction design research community, particularly when design actions and/or strategies for developing software artifacts of particular relevance are the subject of the discussion [37,38].
With full awareness of this kind of problem, a few interesting studies are worth mentioning, for example those reported in [39][40][41][42]. Even though some of them bring up the concept that, when a software artifact is co-designed for a public institution (for example, a school or a University), users should be engaged who cover different roles (for example, from students-to-be to undergraduates, graduates, alumni and, finally, teachers), all these studies fail to provide a precise assessment of the different contributions that each category can supply. This is especially evident in research initiatives like those described in [43,44] where, in both cases, novices (i.e., future users) are engaged just to assess a software artifact, without active participation in the design process activities.
Unlike those, our paper represents the first study, to the best of our knowledge, where an effort has been devoted to providing an analytical measurement of the importance of actively engaging (future) users in the initiative of co-designing a human-centered software artifact. Other preliminary approaches along this same line of research are present in the literature, yet limited in scope and in analytical assessment, like those discussed in [45][46][47][48][49][50]. In particular, such preliminary studies differ from our investigation, since they are mainly devoted to evaluating how freshmen and undergraduate students exploit smart devices and mobile apps while conducting their learning and University activities. It is worth mentioning that none of these studies has involved students in reasoning about the design of the software of interest.

How the design process evolved
We provide, now, some relevant details on the kind of participation supplied by the two aforementioned groups of users who took part in the experiment we have developed (i.e., futures and experts).

Future users
In total, 28 functions were proposed by future users, as shown in the second column of Table 1. All those functions have been split over 10 different categories, representing specific topics, as portrayed in the first column of Table 1.
Of interest in this process is the fact that the 200 students (future users) taking part in the experiment were divided into 45 groups. After a 10-minute introduction, each group was required to propose and illustrate possible functions to be developed, drawing on smartphone-shaped sheets of paper, like those portrayed in Fig. 2. At the end of this phase, all groups were required to agree on a final set of functions, namely those of Table 1.

Expert users
Concerned that the functions suggested by 200 novices might not comprehensively cover the complex problem we were analyzing, we brought into play 59 expert users.
They were students already attending some 40 different degrees at the University of Bologna.
This time, they were first asked to either confirm or reject the 28 functions suggested by the future users. Then, they were asked to provide new original contributions, specifically, further functions besides those suggested by future users. The result of this second phase was, simply, that expert users accepted all the 28 functions suggested by future users, and then proposed 10 further new ones, specifically those portrayed in Table 2.
Of relevance here is our choice to conduct this experiment by showing expert users all the functions suggested by future users, and then asking them to either accept or reject those functions. One possible alternative would have been to hide those functions from expert users, asking them to propose the entire set of possible features to be developed. Our choice of making explicit to experts the functions suggested by future users stems from the consideration that expert users typically cannot be easily influenced. Moreover, the reader should not forget the starting point of our research: we have here challenged the usual opinion that only expert users can provide a reliable contribution in a software design process based on users' participation, while future users should be excluded due to their lack of experience.
In the end, the intuition behind our research has been confirmed by facts, since all the 28 functions suggested by future users were recognized as useful by experts, and worth including in the final software application.
In conclusion, 38 functions were proposed (28: future users; 10: expert users) at the end of this phase of the experiment.

The committee
A final phase of the experiment was devoted to checking and filtering all the proposals produced by the first two groups. As already anticipated, this role was played by a committee comprising both faculty and staff members of the University of Bologna. The result of their work is portrayed in Table 3, where the final 22 functions chosen by the committee for development are shown. In essence, the committee accepted:
• 16 out of the 28 functions suggested by future users;
• 4 out of the 10 functions suggested by expert users.
Further, they added 2 additional functions (marked in bold, in Table 3).
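The bookkeeping of the three phases can be checked with a few lines of Python. This is only an illustrative sketch: the counts are those reported in the text, while the variable names and the notion of an "acceptance rate" are our own assumptions, not quantities defined in the study.

```python
# Counts reported in the text; names are illustrative assumptions.
proposed = {"future": 28, "expert": 10}   # functions proposed in Phases 1 and 2
accepted = {"future": 16, "expert": 4}    # functions kept by the committee
committee_added = 2                       # functions added by the committee itself

# Functions in the final application: accepted proposals plus committee additions.
final_total = sum(accepted.values()) + committee_added

# Share of each group's proposals that survived the committee's filtering.
acceptance_rate = {group: accepted[group] / proposed[group] for group in proposed}

print(final_total)
for group, rate in acceptance_rate.items():
    print(f"{group}: {rate:.1%}")
```

Under these counts, the total matches the 22 functions of Table 3.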

Users' satisfaction
To conclude this discussion, it is crucial to recall that a mobile software application was actually developed including all the 22 aforementioned functions.
To assess the final users' satisfaction while using this software application, we asked approx. 1500 high school students to answer an online questionnaire, where a 5-point Likert scale was proposed to score each different category of Table 3 (1: completely not satisfied; 5: completely satisfied).
We report the most relevant results of this experiment in Table 4, where, for the sake of conciseness, we further clustered the 10 categories of Table 3 into 6 more compact groups. As shown in Table 4, all the 6 groups of functions got satisfying results, on average. More than 80% of the students who were asked for an evaluation provided a response to our questionnaire; hence, the corresponding number of responses can be considered statistically reliable. Even more relevant was an additional question we asked all of them. The question was whether the set of provided functions had covered, in their opinion, the whole set of possible University problems. Out of 1342 interviewees, 1282 responded positively, yielding a percentage of 95.53% of interviewees confirming their satisfaction.
Beyond this positive evaluation, we were interested in shedding a brighter light on these results. Hence the closer analysis presented in the "Results and discussion" section.
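As a quick arithmetic check of the figures above, the following sketch recomputes the share of positive answers and a rough response rate. The number of invited students is only given as "approx. 1500" in the text, so the value used here is an assumption and the response rate is indicative only.

```python
# Figures reported in the text.
respondents = 1342        # interviewees who answered the additional question
positive = 1282           # interviewees who answered positively
invited_approx = 1500     # assumption: the text only says "approx. 1500"

# Percentage of respondents confirming their satisfaction.
coverage_pct = 100 * positive / respondents

# Indicative response rate, given the approximate number of invited students.
response_rate = respondents / invited_approx

print(f"positive answers: {coverage_pct:.2f}%")
print(f"response rate (approx.): {response_rate:.1%}")
```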

Results and discussion
To answer our initial research question and validate our intuition, we counted the number of contributions each different group of users produced ("Amount and quality of contributions" section), and then measured analytically their relevance ("Similarity between groups" section). The aim of this final part of our study was to understand whether the inclusion of expert users had provided a contribution of relevance or not.

Amount and quality of contributions
To this aim, we counted how many different contributions, out of the set of functions that were finally implemented, were provided by either the future or the expert users. This has been done with a simple enumeration of the instances of the proposed functions.
Consider the set F of functions proposed by the future users (here it would be better to call them features, so as not to overload the term function), the set E of those proposed by expert users, the set C of those proposed by the committee, and, finally, call X the union of F, E, and C. Consider the generic feature $f_i \in X$ and take, now, the following mapping, based on the well-known indicator function of a subset D of the set X, $I_D : X \to \{0, 1\}$, defined as:

$$I_D(f_i) = \begin{cases} 1 & \text{if } f_i \in D \\ 0 & \text{if } f_i \notin D \end{cases}$$

Applying the aforementioned indicator function, we obtain the results reported in the three rightmost columns of Table 5. To provide a clearer visual representation of those results, we have drawn Fig. 3 and Fig. 4, where the contribution provided by future users emerges as predominant (see Phase 3 in Fig. 4).
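The indicator-function mapping described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: the feature labels f1 to f5 are hypothetical placeholders and do not correspond to the functions listed in the tables of this paper.

```python
def indicator(subset, universe):
    """Return the 0/1 values of the indicator function I_D over an ordered universe X."""
    return [1 if feature in subset else 0 for feature in universe]

# Hypothetical feature labels (placeholders, not the functions of Tables 1-3).
F = {"f1", "f2", "f3"}   # proposed by future users
E = {"f4"}               # proposed by expert users
C = {"f5"}               # proposed by the committee

# X is the union of F, E, and C; sorting fixes an order for the feature index i.
X = sorted(F | E | C)

I_F = indicator(F, X)
I_E = indicator(E, X)
I_C = indicator(C, X)

for name, values in (("I_F", I_F), ("I_E", I_E), ("I_C", I_C)):
    print(name, values)
```

Each resulting list plays the role of one column of 0/1 values, with one entry per feature in X.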
The same information of the third histogram of Fig. 4 is further portrayed, under an alternative representation, in Fig. 5 for a more impressive communication of the results.
All this said, while the relevance of the contribution provided by future users is crystal clear (from a quantitative viewpoint, obviously), a more intriguing question is whether experts have played a relevant role within this process. We will discuss it in the next "Similarity between groups" section.

Similarity between groups
We here resort to a more sophisticated statistical analysis to reason about the similarity of the different groups of features (i.e., functions), as provided by the correspondent contributors (futures and experts).
In essence, to respond to the question posed before, we tried to look at this problem from a statistical perspective, checking for and working on the similarity of the clusters of the provided features (i.e., functions).
To this aim, consider now the three codomains of the indicator function, one for each group of contributors, denoted A, B, and C in the following. We then considered the three following statistical techniques, aimed at defining similarity among different sets:
• The Cosine similarity: a measure of similarity between two non-zero real-valued vectors of an inner product space, given by the cosine of the angle between them (range: 0-1).
• The Jaccard similarity coefficient: a statistical index used for comparing the similarity (and diversity) of sample sets of binary values. It measures the similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets. It only considers the total number of attributes where both the sets have a value of 1 (range: 0-1).
• The Simple matching coefficient: a statistical index used for comparing the similarity and diversity of sample sets. It also considers the total number of values where both the sets have a value of 0, as well as 1 (range: 0-1).
That said, we applied the three methods above to our sets: A, B, and C, with the final result shown in Table 6.
On the basis of the statistical theories behind the similarity techniques we exploited, we can maintain that a good level of similarity is reached between two different groups if the threshold of 0.5 is surpassed.
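To make the three similarity measures and the 0.5 threshold concrete, the following Python sketch computes them on two binary vectors. It is an illustration under stated assumptions: each group is assumed to be encoded as a 0/1 vector over the same ordered set of features, the vectors u and v are toy data (not the values behind Table 6), and the function names are our own.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def jaccard_similarity(a, b):
    """Positions where both are 1, divided by positions where at least one is 1.

    Assumes at least one of the two vectors contains a 1.
    """
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either

def simple_matching(a, b):
    """Positions where the two vectors agree (both 0 or both 1), over all positions."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def is_similar(score, threshold=0.5):
    """Apply the 0.5 threshold mentioned in the text."""
    return score > threshold

# Toy binary vectors (illustrative only, not the data of this study).
u = [1, 1, 1, 0, 0, 1]
v = [1, 1, 0, 0, 1, 1]

for measure in (cosine_similarity, jaccard_similarity, simple_matching):
    score = measure(u, v)
    print(f"{measure.__name__}: {score:.3f} similar={is_similar(score)}")
```

Note that the Jaccard coefficient ignores positions where both vectors are 0, while the simple matching coefficient counts them as agreement, which is why the two indices can differ on the same pair of vectors.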
Along this line of reasoning, Table 6 seems to suggest that the contributions provided by future users and expert users are to be considered similar. Yet, we have here to remind [...] Much less similarity may be found, instead, if we compare the contribution provided by the committee with that of the other two groups. This confirms an additional, and well expected, result: the main role of the committee was limited to filtering out some of the proposals coming from users (futures and experts). This final result is clearly confirmed if we count the number of functions proposed by future users and then rejected by the committee: exactly 12 out of 28 (almost 43% of the total amount). Moreover, the functions proposed by experts and then discarded by the committee amount to 6 out of 10 (exactly 60%). In these terms, the committee has represented a crucial point where innovation has been, in some sense, lost. Conversely, at the other end of the spectrum, while it is confirmed that future users were the real innovation accelerators in this process with their many and different suggestions, the role of expert users appears questionable.
On one side, in fact, they have not decelerated the innovation brought by future users, as none of the functions proposed by futures has been rejected by them. On the other side, nonetheless, if we contrast the number of the functions provided by expert users [23] against the total number of the suggested functions [44], we can observe that their contribution in terms of novel ideas (25%) is moderate (not to say marginal).

Conclusions and future work
Our study stems from the necessity to overcome some traditional participatory design limitations in the process of software design. Many have thought that future users of a given software application are not the perfect team of collaborators, based on the motivation that they do not have a complete knowledge of that specific domain, not being already current users.
We developed a complex experiment to challenge such a statement. In particular, we enlarged the target audience of users of a given application by including also expert users, as well as a final committee of evaluators. Our results witness a (not) surprising fact: most of the innovation in terms of new proposals has come from future users. The committee just played the role of moderator, while experts provided only a partial contribution, in terms of innovation (ideas and new proposals).
In some sense, we have extensively questioned the traditional opinion that only expert users can profitably contribute to a participatory software design process, demonstrating instead that future users would be often better suited.
Also interesting is the fact that the software application that was subjected to the aforementioned three-phase process got a satisfaction score of approx. 78% from its final users (approx. 1500 high school students). Based on these further interesting results, the question remains open whether that satisfaction value (78%) would increase or decrease with a different kind of intervention by experts along the software design process we conducted.
Abbreviations
HCC: Human-centered computing; UCD: User-Centered Design; IT: Information technologies; F: Future users; E: Expert users; C: Committee.