DMPonline Version 4.0: User-Led Innovation

DMPonline is a web-based tool to help researchers and research support staff produce data management and sharing plans. Between October and December 2012, we examined DMPonline in unprecedented detail. The results of this evaluation led to some major changes. We have shortened the DCC Checklist for a Data Management Plan and revised how this is used in the tool. We have also amended the data model for DMPonline, improved workflows and redesigned the user interface. This paper reports on the evaluation, outlining the methods used, the results gathered and how they have been acted upon. We conducted usability testing on v.3 of DMPonline and the v.4 beta prior to release. The results from these two rounds of usability testing are compared to validate the changes made. We also put forward future plans for a more iterative development approach and greater community input.


Introduction
Data management planning is a topic of increasing importance to researchers, their host institutions and research funders alike. Many research funders require a Data Management Plan (DMP) as part of grant proposals, and several UK universities have also mandated them. The Digital Curation Centre (DCC) has done significant work in this area for some years. We initially compared funders' data policies, examining requirements for DMPs and using this analysis to author a Checklist for a Data Management Plan. We also produced the first online tool to assist in the data management planning process: DMPonline.
DMPonline is a web-based tool to help researchers and research support staff produce data management and sharing plans. It was first demonstrated at the Jisc conference in London in 2010 and has since undergone regular updates and some major redevelopment. The tool has received international recognition, and we were very pleased when it was shortlisted along with the work of US colleagues on DMPTool for the DPC's Digital Preservation Awards at the end of 2012.
Despite this success, we knew there was some room for improvement. We undertook an in-depth evaluation of DMPonline in Autumn 2012, eliciting feedback from existing users and people who had not experienced the tool before. A great deal of positive feedback came out of this process, but there were also some clear messages for change. This paper reports on that evaluation and how we have revised the tool. The latest version (v.4) was released in December 2013.

Evaluation
As part of the ongoing DCC tools strategy, we decided to evaluate our most popular Research Data Management (RDM) tools. DMPonline had received high levels of interest and had over 2,000 users. The usage statistics did not indicate the depth of engagement or user satisfaction, so we wanted to employ other methods to examine these in more detail. At the same time the DCC was undertaking an Institutional Engagement at the University of Edinburgh. The Edinburgh RDM Steering Group asked us to test DMPonline with researchers to ensure the tool was fit-for-purpose before wider rollout across the institution. These two programmes of work were combined.
Between October and December 2012 we examined DMPonline in unprecedented detail. Users had periodically emailed comments and requests for new features, and we were receiving increasingly detailed feedback via the Jisc Managing Research Data (MRD) programmes (e.g. Shotton, 2012; Cope, 2012; Proudfoot, 2012). While much of this feedback was positive, there were also some serious criticisms that caused concern. The main issues raised were the number of questions being asked in DMPonline and the length and format of the output. It was felt that the tool was overly detailed and complicated, which may be off-putting to researchers.
doi:10.2218/ijdc.v9i1.312 Getler et al. | 195

Evaluation Methods
We employed a variety of methods to explore these views in more detail. The unsolicited feedback we had received provided a starting point from which to frame other examinations. Initially, we sent a survey to existing DMPonline users at the University of Edinburgh and conducted two focus groups. Although the feedback from focus groups can be contentious, as memories are imperfect and users do not always tell the whole truth, they helped to uncover further issues that could be investigated more objectively through guided interviews, usability tests and heuristic evaluations. The aim was to conduct a holistic study using a wide variety of research methods in order to combine multiple data sources and generate powerful insights about the tool.
Data was gathered from the following sources and methods:
• Tool analytics to interpret the level of uptake,
• Unsolicited feedback from users via emails and blog posts,
• A survey and feedback from DMPonline users at Edinburgh,
• Focus groups to gather users' perspectives on the tool,
• Guided interviews to walk through the experience of using the tool,
• Heuristic evaluations to examine the usability of the interface,
• Usability testing with researchers who had not yet encountered the tool.
Across these methods we engaged with or received feedback from around 50 individuals, representing researchers and research support staff. Emphasis was placed on feedback from researchers, since they are the primary audience for the tool. In particular, their responses can be seen in the guided interviews, usability tests and much of the feedback from emails and blogs.

Tool analytics
A number of different metrics are automatically collected in DMPonline. These include the number of users and plans, the number of plans by template and the number of templates used in each plan. We also reviewed the number of institutional customisations and the uptake of these templates. By examining the percentage completion of plans and user return rates, we hoped to get a sense of engagement and separate out preliminary testing and exploration of the tool.

Blogs and email comments
Various users, particularly from the MRD community, have provided detailed feedback on the tool. Sometimes comments have been made public on blogs and project websites, while others have emailed their suggestions directly to us. Responses have typically been received from research support staff, though in many cases they were passing on the findings of local testing with researchers.

Survey results and discussions with Edinburgh DMPonline users
To begin the Edinburgh DMP pilot we identified 27 people who had registered for DMPonline using an '@ed.ac.uk' email domain. Around half were research support staff and half were researchers. We emailed them five basic questions to elicit general feedback on the tool. The questions covered why they registered, their experience of using the tool, and what additional support they would like to see. We followed up the survey by inviting people to take part in a focus group or guided interview. The Chair of the Edinburgh RDM Steering Group has also provided extensive feedback on the tool throughout the redevelopment process.

Focus groups
A focus group is a technique to explore what individuals believe or feel, as well as why they behave in the way they do. It involves group interviews 'focused' on a given topic, in which participants are selected because they share a characteristic of a specific population: in our case, the group was made up of research support staff. Participants were selected on the basis that they had used the tool, would have something to say about it, and would be comfortable talking to the interviewer and each other. We ran two focus groups. The first one took place at the University of Edinburgh, with the second following soon after in Nottingham during the Jisc MRD Progress Meeting.

Guided interviews
At the core of this method is a list of 105 adjectives that describe the system (Travis, 2009). These range from positive assertions such as 'simple', 'clean' and 'powerful' to more negative associations, such as 'hard to use', 'confusing' and 'frustrating'. Interviewees are asked to select as many as they like that apply to the interface. The interviewer then asks the participant to circle five adjectives from those chosen, and these adjectives become the basis of the guided interview. During the interview the participant discusses the reasons for his or her choice of words.

Heuristic evaluations
A heuristic evaluation, also called expert analysis, is a method in which an evaluator acts as a user of the tool and attempts to complete a set of tasks. For DMPonline the tasks were: 1. Create an account, 2. Create a new research data management plan, 3. Populate the plan, 4. Export the plan. The participants evaluated the interface against a set of principles known as Jakob Nielsen's 10 Usability Heuristics (Nielsen, 1995). Examples of the ten heuristics in user interface design were also given to evaluators.
A heuristic evaluation can provide valuable feedback and identify obvious usability problems before usability testing is carried out with representative users. Based on research in Human-Computer Interaction, it is expected that five 'novice' evaluators ('novices' with respect to usability but not to data management planning) find on average about half of the usability problems (Nielsen, 1992).
The heuristic evaluations were completed by DCC staff. Although not usability specialists, DCC staff are authorities in research data management and we thought it was important to include their feedback. Six reports were received and analysed.

Usability testing
A usability test is a technique used to evaluate a product. Typically the test is conducted with a group of potential users, in our case seven researchers. Users were set a number of common tasks based on the functionality of the tool, including registering for the tool and creating, sharing and exporting a plan. Participants were asked to rate the expected difficulty of each task before attempting it and the experienced difficulty afterwards. At the end of each session the facilitator asked the participant to subjectively assess the usability of the tool using a standardised System Usability Scale (SUS) questionnaire.
All sessions were run by two DCC staff (a facilitator and an observer). They noted success rates in completing the tasks, the time taken to complete and any errors made. Sessions were recorded and analysed to identify potential areas for improvement to the tool. A second round of usability testing was undertaken prior to releasing v.4 of the tool. The results section compares the findings from these two rounds.

Summary of Findings
A wealth of data was gathered during the evaluation. There was consistency in the results across the different methods, with similar issues and recommendations being repeated throughout. The major action to emerge was the need to review the DCC Checklist and its use within DMPonline.
We received a lot of positive feedback about DMPonline and strong support from the community. It is a well-regarded tool with international reach. People were aware of its progress and commented that it was a "real stimulus" that has improved over time. Several people noted that they prefer an online tool to Word documents and were glad that the DCC was delivering a service for the community. There was also a very strong demand for institutional customisations, and people wanted to be actively involved in the future of DMPonline via user groups and developer forums.
The sharing feature, which allows users to co-write plans with colleagues, was praised in particular. People liked this addition and felt it worked well. Researchers also found the ESRC template particularly useful, with one commenting that: 'I found DMPonline a very useful and neat tool for preparing a DMP for a submission to ESRC. The clearly-structured questions break down the (rather daunting) requirements of ESRC for DMP into bite-sized chunks. I found the 'DCC guidance' button for each question particularly valuable: it is useful to have some tips and suggestions for each question, and this also helped me to check that my answers were on the right track.' Comments were made that the tool was technically very strong and well-coded. The guidance was also praised, particularly the ability to customise this.
Unfortunately, we also uncovered a number of serious issues. Users had problems with the number of questions being asked and the level of detail they went into, consequently describing the tool as longwinded, off-putting and repetitive. Concerns were also raised about the process of mapping requirements to the Checklist. Mismatches meant that two substantially different questions could be asked and the nuances of disciplinary differences or funders' intentions were sometimes lost. There were also some issues with the export process, as plans failed to meet funders' formatting restrictions.
Several users were confused by the templates and did not know which to select. They reported being overwhelmed by "a forest of options" and "lots of boxes for ticking with no explanation of what they were." Several researchers also requested seeing their funder's requirements in full before starting to answer questions, so they could grasp the extent of what was required, interpret this in context and consider options with co-investigators before beginning to write their plan.
The most troubling finding was that some users thought the tool was too confusing and difficult to use, so reverted to other options instead: 'I tried to use the tool but found the templates were too long, too complicated and in the end used a colleague's data template.' 'Embarrassingly, I had no idea how it was supposed to work and in the end didn't use it at all.' Some users seem to have been confused about what the tool could offer and struggled to understand the conceptual framework on which it was based. A guided interview with a researcher at Edinburgh was particularly illuminating in this regard. After registering for the tool, he determined that it was going to be more complicated than it was worth and pursued an alternative means of putting together a DMP. His feedback explains how he reached that decision and what he had been expecting: '...I think my expectation was that there would be sort of, okay, this is an ESRC proposal, so these are the five things that need to be in it, and here's the kind of generic information that you need to include, and then here's where you need to say something about your project in particular.' It seems users did not always associate the questions they were seeing with their funder's requirements, and expected a clearer process with richer guidance and support.
The blank text boxes were found to be off-putting, as it was unclear what level of detail was needed, and users expressed a desire for more practical examples and suggested answers. The researcher being interviewed concluded that he did not think there was anything specifically wrong with the online tool (he commented that it was well-coded), but he thought the conceptual side needed some work.
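After analysing the feedback we put forward a number of suggested actions. The following were prioritised:
• Introduce a wizard to help users select a template,
• Add a page outlining requirements prior to writing the plan,
• Shorten and improve the relevance of the Checklist,
• Revise how the Checklist is used in the tool,
• Ask funder/institutional questions directly (rather than Checklist questions),
• Enhance guidance with more examples and suggested answers,
• Provide links to local support and institutional services,
• Revise export options to meet any formatting restrictions.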

Responding to Users' Feedback
We embarked on a number of changes in light of the findings. We agreed to revise the Checklist and rethink how this was used in the tool, as well as redesigning the workflows and user interface. This was a large undertaking, so work was divided across a team of DCC staff with different skills and knowledge. We also outsourced some of the work to professional designers and developers.

Rethinking the Checklist for a DMP
The DCC Checklist presents the main questions or themes that researchers may want to consider when writing a DMP. Over the years it grew in response to suggested additions, ending up listing over 100 questions. To define a more manageable set of questions, we synthesised requirements from funders and institutions with best practice within the wider community, drawing out common themes. This exercise resulted in a list of 13 questions and some suggested administrative data.
The Checklist was central to versions 1-3 of DMPonline. Funder requirements were mapped to the closest question(s) from the Checklist. These chosen Checklist questions were then presented for users to answer, as seen in Figure 1. Similarly, guidance from institutions or disciplines could be associated with specific Checklist questions and would be displayed any time these questions were used in templates.
By comparison, in version 4 it is the system that does most of the work of selecting the questions and associated guidance, which are presented to the user for response. A set of themes is now used to associate guidance from other sources (e.g. institutions or disciplines) with the questions. In the example in Figure 2, you can see guidance from the University of Glasgow on 'metadata' and 'data formats' has been pulled through to assist researchers responding to questions from the Biotechnology and Biological Sciences Research Council (BBSRC).
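The theme-based association can be pictured with a short sketch. It is illustrative only: the class names, the question text and the guidance strings are hypothetical and are not taken from the DMPonline data model or codebase. The only idea carried over from the description is that questions and guidance are both tagged with themes, and guidance is displayed wherever the themes overlap.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Question:
    """A funder question tagged with one or more themes (hypothetical model)."""
    text: str
    themes: set[str] = field(default_factory=set)


@dataclass
class Guidance:
    """Guidance from an institution or discipline, tagged with a single theme."""
    source: str
    theme: str
    text: str


def guidance_for(question: Question, guidance: list[Guidance]) -> list[Guidance]:
    """Return every guidance item that shares a theme with the question."""
    return [g for g in guidance if g.theme in question.themes]


# Invented example data, loosely modelled on the Figure 2 scenario.
question = Question(
    text="Placeholder funder question about data and metadata standards",
    themes={"metadata", "data formats"},
)
all_guidance = [
    Guidance("University of Glasgow", "metadata", "Placeholder metadata advice."),
    Guidance("University of Glasgow", "data formats", "Placeholder formats advice."),
    Guidance("University of Glasgow", "ethics", "Placeholder ethics advice."),
]

matched = guidance_for(question, all_guidance)
print([g.theme for g in matched])
```

Under this arrangement an institution tags its guidance once per theme rather than once per funder question, which is presumably what allows the same guidance to appear across different funder templates.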

Revising Workflows and User Interface
A huge amount of work was involved in redesigning DMPonline. We reviewed feedback from the evaluation to consider how the new version should function and look. Next, using UML (Unified Modelling Language), we fleshed out the requirements and defined a new data model. We created structure diagrams (class diagrams), behaviour diagrams (use case diagrams) and interaction diagrams (sequence diagrams). Based on these diagrams a number of user roles were created, as well as wireframes that represented multiple scenarios.
A digital prototype was created using open source User Interface (UI) prototyping and interaction modelling software Indigo Studio. Several users were invited to comment on the prototype to test the initial design hypothesis and some minor changes were made based on their reactions and suggestions. Finally, the leading UK design agency Tayburn Ltd was hired to produce a new look for the tool. Their remit was simple: starting from our wireframes, they had to produce an online application with a clean interface, an easy to understand workflow and a look and feel that reinforced the DCC brand. User feedback suggests that they have delivered.
The revised UI makes it easier for researchers to navigate the tool. The plan now begins with a summary of the questions they will be asked. This allows researchers to set individual questions in the broader context and get an overview of the entire process. The tabbed interface, zipped sections and features such as the progress bar also help users to visualise how one part fits into the broader whole.
Workflows have also been greatly improved and this solved one of the major problems encountered by users: choosing the appropriate template for a plan. A wizard function was introduced that asks users direct questions to ascertain which template is needed and what guidance should be displayed. Users are invited to select their funder if they are applying for a grant, and to select their institution and any other sources of guidance they wish to see. If no funder or institutional template is appropriate for the user, we present the DCC Checklist as a generic set of questions and guidance.
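The wizard's fallback behaviour can be sketched as a simple lookup. This is a minimal illustration, not the DMPonline implementation: the template registries and the 'Example University' entry are hypothetical, and giving a funder template precedence over an institutional one is an assumption of the sketch.

```python
from typing import Optional

# Hypothetical template registries; the entries are placeholders only.
FUNDER_TEMPLATES = {"ESRC": "ESRC template", "NERC": "NERC template"}
INSTITUTION_TEMPLATES = {"Example University": "Example University template"}
GENERIC_TEMPLATE = "DCC Checklist"


def choose_template(funder: Optional[str], institution: Optional[str]) -> str:
    """Pick a template from the wizard answers, falling back to the generic one.

    Assumption: a funder template takes precedence over an institutional one.
    """
    if funder in FUNDER_TEMPLATES:
        return FUNDER_TEMPLATES[funder]
    if institution in INSTITUTION_TEMPLATES:
        return INSTITUTION_TEMPLATES[institution]
    return GENERIC_TEMPLATE


print(choose_template("ESRC", "Example University"))
print(choose_template(None, "Example University"))
print(choose_template(None, None))
```

Because the generic Checklist is the final branch, every combination of answers yields a usable template, so the user is never asked to pick from an unexplained list.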

Comparing Usability in v.3 and v.4 of DMPonline
We undertook detailed usability testing of DMPonline v.3 in Autumn 2012, and repeated this process once the beta of v.4 had been released. The tasks differed slightly as there were some new features we wanted to test in v.4, namely the provision of tables in the NERC template, multiple phases of plans (pre- and post-award), and the ability for users to change their password. However, the main tasks remained the same across the two rounds of usability testing. These were to register for the tool, create and fill in a plan, share the plan and export it. These are the core, basic tasks that any user would carry out in the system. A full list of tasks from each round is in Appendix 1.
For each task we measured success by recording the following:
1. Completion rates: A binary measure of task success, where 1 = success and 0 = failure;
2. Task completion time: A note of how long each user spent on the activity;
3. Errors: A list of any unintended action, slip, mistake or omission. Where possible, these were mapped to usability problems;
4. Usability problems: A list of problems encountered by users with a description and severity rating as follows: the problem 'prevents task completion', 'causes a significant delay or frustration', 'has relatively minor effect on task performance', or 'is a suggestion';
5. Expectation ratings: A pre- and post-test assessment to rank the difficulty of each task;
6. Satisfaction ratings: A subjective assessment given after the test, when users were asked to complete a standardised usability questionnaire.

Task Completion Rate
At the core of the usability tests were a series of 'critical' tasks that users were likely to attempt when using the online tool. The tasks represented key functions and features of the tool. In v.4, participants were able to complete most of them. All users successfully completed Task 1 (sign up for DMPonline), Task 2 (start a new plan), Task 3 (provide a brief answer to a question in the plan), Task 5 (share the plan), Task 6 (export the plan) and Task 7 (change your password). Five of the six users (83%) completed Task 8 (contact DCC) but only a third (33%) completed Task 4 (return to your plan and provide two examples of datasets). The biggest difficulty in Task 4 was editing the table, in particular adding a new row. Users expected to be able to use the tab key to move between cells and add new rows by pressing the 'return' key. We have since revised the tables so they are presented more clearly and are easier to navigate and complete.
In spite of difficulties with Task 4, the results of the usability testing demonstrated that the new interface was far more user friendly. Task completion rate was markedly higher. In v.3, only two out of seven tasks (save your answers and update the plan) were successfully completed by all participants. In comparison, six out of eight tasks were successfully completed by all participants when testing v.4. A basic task, such as signing up for DMPonline, was successfully completed by only three participants (43%) in v.3, compared with all six participants in v.4. Tables 1 and 2 detail the task completion rates.
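The percentages quoted above follow directly from the binary completion measure. The short sketch below reproduces them; the outcome lists are reconstructed from the counts reported in the text (three of seven, five of six, two of six), and the order of participants within each list is arbitrary.

```python
def completion_rate(outcomes: list[int]) -> float:
    """Share of participants who completed a task, as a percentage.

    Each outcome is 1 for success and 0 for failure.
    """
    if not outcomes:
        raise ValueError("at least one outcome is required")
    return 100 * sum(outcomes) / len(outcomes)


# Reconstructed from the reported counts; participant order is arbitrary.
signup_v3 = [1, 1, 1, 0, 0, 0, 0]  # three of seven participants
task8_v4 = [1, 1, 1, 1, 1, 0]      # five of six participants
task4_v4 = [1, 1, 0, 0, 0, 0]      # two of six participants

for name, outcomes in [
    ("v.3 sign-up", signup_v3),
    ("v.4 Task 8", task8_v4),
    ("v.4 Task 4", task4_v4),
]:
    print(f"{name}: {completion_rate(outcomes):.0f}%")
```

With six or seven participants per round, a single user changes a rate by roughly 14 to 17 percentage points, so the rates are best read alongside the raw counts.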

Time on Task
We recorded the time spent on every task by each of the participants, measured in seconds. Some tasks were inherently more difficult and took longer to complete, which is reflected by the average time on task. These were filling in the plan and registering, which is a two-step process involving email verification. Full results of time on task are in Tables 3 and 4. When we compare performance across the two versions, completion times were quicker overall in v.4 than in v.3. For example, creating a plan took 492 seconds on average in v.3 compared with 313 seconds in v.4. A similar pattern emerges for sharing (174 seconds compared with 67 seconds) and exporting plans (192 seconds compared with 116 seconds). Registration took a comparable length of time (186 seconds compared with 199 seconds), although completion rates for this were far higher in the new version (43% in v.3 compared with 100% in v.4). These results suggest that v.4 is easier and more intuitive to use.
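To put the averages on a common footing, the relative change per task can be computed from the figures quoted above. The sketch uses only those reported means; the task labels are shorthand introduced here.

```python
# Mean time on task in seconds as (v.3, v.4), taken from the text above.
MEAN_SECONDS = {
    "create plan": (492, 313),
    "share plan": (174, 67),
    "export plan": (192, 116),
    "register": (186, 199),
}


def percent_reduction(before: float, after: float) -> float:
    """Relative reduction in time from `before` to `after`, in percent.

    A negative value means the task took longer in the later version.
    """
    return 100 * (before - after) / before


for task, (v3, v4) in MEAN_SECONDS.items():
    print(f"{task}: {percent_reduction(v3, v4):+.1f}%")
```

On these figures, creating, sharing and exporting a plan were roughly 36%, 61% and 40% quicker in v.4, while registration was about 7% slower.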

Errors
We also captured the number of errors (all unintended actions, slips, mistakes or omissions) participants made while trying to complete the tasks. Tables 5 and 6 display a summary of the test data. The main issues are highlighted in red. These are characterised by a combination of low completion rate, a high number of errors and an above average time on task. Overall, the number of errors is considerably lower for v.4 than for v.3. For example, sharing a plan in v.3 generated 13 errors, while none were recorded when testing v.4. Similarly, signing up for an account generated eight errors in v.3 compared with only one in v.4, and creating a plan recorded 12 and zero errors respectively.

Usability Problems
We recorded all usability problems encountered by users, and calculated an impact score for each of them. Impact scores were calculated by combining four levels of impact with four levels of frequency, as shown in Tables 7 and 8. The resulting score is used to prioritise issues to be solved in future versions of DMPonline. Overall, we recorded 18 issues in v.4 of DMPonline compared with 37 problems in v.3. Full results are available in Appendix 2.
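One way such a score could be computed is sketched below. The severity labels come from the list of measures given earlier, but everything numeric is an assumption made for illustration: the weights of 1 to 4, the quartile bands used for frequency, and the choice to combine the two by multiplication. The actual scales are those defined in Tables 7 and 8.

```python
# Severity labels are from the text; the 1-4 weights are an assumption.
IMPACT_LEVELS = {
    "is a suggestion": 1,
    "has relatively minor effect on task performance": 2,
    "causes a significant delay or frustration": 3,
    "prevents task completion": 4,
}


def frequency_level(affected: int, participants: int) -> int:
    """Map the share of affected participants onto four bands.

    The quartile boundaries are assumed, not taken from Table 8.
    """
    if not 0 < affected <= participants:
        raise ValueError("affected must be between 1 and participants")
    share = affected / participants
    if share <= 0.25:
        return 1
    if share <= 0.5:
        return 2
    if share <= 0.75:
        return 3
    return 4


def impact_score(severity: str, affected: int, participants: int) -> int:
    """Combine impact and frequency by multiplication (an assumed rule)."""
    return IMPACT_LEVELS[severity] * frequency_level(affected, participants)


# A blocking problem hit by four of six users outranks a suggestion from one.
print(impact_score("prevents task completion", 4, 6))
print(impact_score("is a suggestion", 1, 6))
```

Whatever the exact scales, the point of combining the two dimensions is that a severe problem met by a single user and a mild problem met by everyone can be ranked on one list.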

Expectation Ratings
To help us prioritise which functions needed improvement, we asked each participant to rate how difficult or easy they expected each task to be before attempting it (expectation rating) and how difficult or easy they found it afterwards (experience rating). This was rated on a seven-point scale with endpoints of Very Difficult (1) and Very Easy (7), as shown in Tables 9-12. The results were then used to create a scatterplot and mapped onto four quadrants, as shown in Figures 3 and 4. This method allows for quick visual distinction between acceptable results (Don't touch it, Promote it, Big opportunity) and those that will require future iterations (Fix it fast). Expectations about the interaction with a product are an important indicator for overall user experience. The results showed that the interaction with v.4 of DMPonline was positive and enjoyable for users in all but Task 4, which involved completing tables. The results were marginally better for v.4 than v.3. Interestingly, in both rounds of usability testing, participants expected some of the core tasks (e.g. creating, completing and sharing plans) to be harder than they found them to be, suggesting that we can confidently promote DMPonline as an easy to use tool.
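The quadrant mapping can be expressed as a small classifier. The sketch follows the usual reading of the four labels in the expectation-rating method (expected difficulty on one axis, experienced difficulty on the other). The midpoint of 4 on the seven-point scale, and the treatment of ratings at the midpoint as 'easy', are assumptions; the boundaries used in Figures 3 and 4 may differ.

```python
def quadrant(expectation: float, experience: float, midpoint: float = 4.0) -> str:
    """Classify a task by its mean expectation and experience ratings.

    Ratings run from 1 (Very Difficult) to 7 (Very Easy). Treating ratings
    at or above the midpoint as 'easy' is an assumption of this sketch.
    """
    expected_easy = expectation >= midpoint
    was_easy = experience >= midpoint
    if expected_easy and was_easy:
        return "Don't touch it"
    if not expected_easy and was_easy:
        return "Promote it"
    if expected_easy and not was_easy:
        return "Fix it fast"
    return "Big opportunity"


# Invented ratings, for illustration only.
print(quadrant(expectation=3.0, experience=6.0))
print(quadrant(expectation=6.0, experience=2.5))
```

A task that users expected to be hard but found easy falls under 'Promote it', which matches the observation that core tasks were easier than participants anticipated.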

Post-Test Questionnaire: System Usability Scale (SUS)
Finally, we collected a subjective assessment of system usability. At the end of each session we asked participants to rate the usability of the tool using the SUS questionnaire on a five-point scale with endpoints of Strongly Disagree (1) and Strongly Agree (5). Statements covered a variety of aspects of system usability, such as the need for training ('I could use DMPonline without having to learn anything new'), support ('I thought that I could use DMPonline without the support of anyone else') and complexity of the system ('I felt very confident using DMPonline'). SUS scores range from 0 to 100. The SUS score for DMPonline v.3 was 64. This increased to 87 for the DMPonline v.4 beta, a striking improvement in how users assessed the system. To get a better sense of what this score really means, usability researcher Jeff Sauro uses percentiles. In essence, these tell you how usable your product is relative to the other products used to develop the curved grading scale (see Table 13). A SUS score of 64 has a percentile range of 35-40%, which means that DMPonline v.3 was considered more usable than 35-40% of the products in the Sauro database and less usable than 60-65%. The v.4 beta meanwhile fell within the top bracket of the most usable products. Full results are available in Appendix 3. All participants agreed or strongly agreed that v.4 was simple, easy to use, intuitive and consistent. They felt that the various functions within DMPonline were well integrated and were confident to use the tool without having to learn anything new. User attitudes were much more positive towards v.4, so it appears that the changes made have addressed many of the concerns raised.
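For readers unfamiliar with how five-point responses become a score out of 100, the sketch below applies the standard SUS scoring rule. It assumes the standard ten-item questionnaire in which odd-numbered statements are positively worded and even-numbered statements negatively worded; if the wording or ordering was adapted for DMPonline, the item polarity would need adjusting. The responses shown are invented.

```python
def sus_score(responses: list[int]) -> float:
    """Score one participant's ten SUS responses (each 1-5) on the 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1  # odd items: agreement is favourable
        else:
            total += 5 - response  # even items: disagreement is favourable
    return total * 2.5


def mean_sus(all_responses: list[list[int]]) -> float:
    """Average the per-participant scores to get the score for a system."""
    return sum(sus_score(r) for r in all_responses) / len(all_responses)


# Invented responses for two participants, for illustration only.
participants = [
    [5, 1, 5, 2, 4, 1, 5, 1, 4, 2],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
]
print(mean_sus(participants))
```

Because each item contributes between 0 and 4 points and the sum is multiplied by 2.5, a neutral response to every statement gives 50, not 0.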

An Improvement in Usability
The usability testing of the DMPonline v.4 beta confirmed that the revised tool is a huge improvement. We observed that for all study participants the tool's conceptual model (how the information is organised and presented in the system) matched with their own. Most users found it easy to locate information, to find information a second time, and most importantly, to navigate around the website and complete basic tasks without any significant difficulties.
There was a marked increase in usability between v.3 and v.4 across all measures of assessment. Task completion rates increased, less time was spent on tasks, fewer errors and usability problems were recorded, and expectation ratings and subjective usability scores were better. It appears that the new data model, revised workflows and redesigned interface match users' expectations and needs more closely. The feedback from usability testing was critical and, as part of an iterative design process, enabled us to refine and advance concepts to better meet user needs before the release of DMPonline v.4. It is the first time that the interface has undergone such comprehensive review, and we will continue to test new features regularly with researchers to improve the tool. An iterative approach with greater input from the user community is planned.

Future Plans for DMPonline
The evaluation has provided very useful insights and we are committed to further review to ensure the tool remains useful to the community. Since releasing v.4, we have had several requests for customised versions from UK universities and we are enhancing the functionality to meet their suggestions. There is also a range of new features we hope to add in the next phase of work, including a comment feature to allow users to discuss sections when sharing plans, more tailored export options, and better integration with institutional repositories and Research Information Management systems.
Gathering feedback from users has helped us to define a clear plan of action and prioritise which features are most needed. The focus groups demonstrated how much community support there is for DMPonline and the desire to feed into plans. We are continuing to use GitHub so others can download and contribute to the code, and are currently exploring options to work with developers at the University of Leeds to integrate DMPonline with their local systems. We intend to make the most of user input and plan to establish a community group to allow external parties to help steer the future direction we take with DMPonline. With the help of the user community, we can deliver a better tool.

• Problem 18: Project phases buttons not understood ("that whole lifecycle is not explained anywhere, I don't know how many phases there would be to that funding" "I'm not sure this was the button to start with because it is so much bigger… looked like a graphic" "It's telling me how far I've got through, which I don't really understand" "I only had one stage").
• Problem 19: Lack of 'next' and 'back' buttons on the form (accessibility issues) ("I was expecting to go to next section two here or jump back to section one").
• Problem 20: Users are asked for information they couldn't have at grant application stage, e.g. budget.
• Problem 37: Confusion regarding placement of buttons 'Open', 'Download'… ("Why are these buttons midway up the form where I haven't got through all the options…").
1. Create an account, 2. Create a new research data management plan, 3. Populate the plan, 4. Export the plan.


• Introduce a wizard to help users select a template,
• Add a page outlining requirements prior to writing the plan,
• Shorten and improve the relevance of the Checklist,
• Revise how the Checklist is used in the tool,
• Ask funder/institutional questions directly (rather than Checklist questions),
• Enhance guidance with more examples and suggested answers,
• Provide links to local support and institutional services,
• Revise export options to meet any formatting restrictions.

Table 3. Time per task for v.3.

Table 7. Scale used to define impact scores: impact criteria.

Table 8. Scale used to define impact scores: frequency.
After creating a plan users don't know how to fill it in (had to be prompted by the facilitator). Not very clear straight away what 'Edit Plan' means; confusing it with 'Edit project details'. System doesn't tell users how much effort is required to fill in the plan, e.g. after Steps 1 and 2.
• Problem 14: Problem 15: Problem 17: Confusion regarding labels 'Export' = share, 'Review', and what do they mean. Confusion regarding layout ("why boxes are next to each other… people scroll"). Confusion regarding buttons 'Open', 'Download', 'Finished'* ("Back to Plan is like 'Cancel' basically…" "Open I don't know if that means open in the browser or launch Word or whatever" "Open is similar to 'Download'… surely it's the same option… When you click on 'Download' you see the option for 'Open', 'Save' and 'Download'").

Table 15. Usability problems encountered by users in DMPonline v.4. The 'X's represent users who encountered a problem. For example, participant 3 (P3) encountered problems 1, 8, 9, 10 and 11.
Delayed action on a mobile and desktop when user signs up for an account using a desktop computer but wishes to confirm the account on a mobile.
• Problem 1: It is not clear to the user how to add rows to the table in a template.
• Problem 2: