Depositing Data: A Usability Study of the Texas Data Repository

Objective: The purpose of this study is to examine the usability of the Texas Data Repository (TDR) for data depositors who are unfamiliar with its interface and to use the results to improve the user experience. Methods: This mixed-method research study collected qualitative and quantitative data through a pre-survey, a task-oriented usability test with a think-aloud protocol


Introduction
The Texas Data Repository (TDR) is a consortial platform for publishing, sharing, and archiving data created by faculty, staff, and students at member Texas higher education institutions (Texas Digital Library 2021). The repository (https://dataverse.tdl.org) uses the Dataverse Project software, an open-source application developed and used by Harvard University. The repository was created in 2017 and is hosted by the Texas Digital Library (TDL), a consortium of academic libraries in Texas that provides shared technology services to digital scholarship collections.
The TDR was developed "to make research materials freely available to anyone, anywhere, and at any time," by enabling researchers to publish and preserve research data to meet funding agency and scholarly journal data publishing requirements (Texas Data Repository 2021a). The TDR can host small to mid-sized datasets that are free of confidential or sensitive information, since deposits are encouraged to be openly accessible to the public. Deposited data can come from any scholarly discipline and be in any file type, and the repository allows for the inclusion of readme files and other supplementary documentation. Key benefits of the TDR include the ability to store and organize datasets, version tracking, and the assignment of a digital object identifier to datasets for citation. A liaison librarian at each of TDL's participating member institutions provides deposit assistance to researchers at their institutions. The TDR was developed as a hybrid service model, with TDL staff hosting the repository but providing flexibility for local institutional control of programs and services. This allows institutions to adjust services based on need and available staffing (Texas Digital Library Dataverse Implementation Working Group 2016).
While multiple universities participate in the TDR, this study focused on researchers' experience of using the TDR at Texas A&M University, which uses a self-deposit model with support as needed, including workshops, online guides, and consultations. Texas A&M University has the second highest number of users and dataset creators in the TDR (Sare, Chan-Park, & Waugh 2021). Across all institutions, the TDR averages about 15 new users and 30 new datasets a month, with most of the datasets openly accessible (Sare, Chan-Park, & Waugh 2021).
The purpose of this study was to examine the TDR's usability for researchers who are unfamiliar with depositing data in the TDR. The study was conceived when the authors reviewed a set of deposited datasets, drawn as a random sample for another project, and noticed that some had poor or incomplete metadata. The authors decided to assess the usability of the TDR's interface to see whether researchers faced obstacles when depositing their data.
A pre-survey and a task-oriented usability test with an exit questionnaire were conducted to explore the TDR's usability.
The research questions are:
1. Are the data depositors satisfied with their use of the TDR?
2. Can the data depositors use the TDR effectively and efficiently?
3. What training materials and guidance can the libraries provide to depositors to improve the user experience?

Literature Review
Many researchers have described usability as a multifaceted concept, and Jeng (2005) offers an excellent exploration of the various definitions of usability. Two of the most widely cited definitions are from Nielsen (1993) and the International Organization for Standardization's ISO 9241-11 (2018). According to Nielsen (1993), usability has five attributes: learnability, efficiency, memorability, error recovery, and satisfaction. ISO 9241-11 (2018), which updates the often-cited draft (1994) and previous (1998) versions, defines usability as "the extent to which a system, product or service can be used by specified users to achieve specific goals with effectiveness, efficiency and satisfaction in a specified context of use" (p. 2). These two definitions are referenced often in library usability studies. For the purposes of this study, the ISO 9241-11 (2018) definition examining effectiveness, efficiency, and satisfaction was employed to evaluate the usability of the TDR, as others have done in their studies (Jeng 2005; Pant 2015; Subiyakto et al. 2021).

Libraries and usability
There have been numerous library website usability studies over the past 20 years, and recently published studies reveal a range in study focus. Kous, Pušnik, Heričko, and Polančič (2020) use the attributes of usability to frame their study, which questions how different types of users experience the library website's usability. For a library website redesign focused on navigation, Ochoa (2020) discovered that users' understanding of library jargon affected their ability to complete tasks. Cirelli & Long (2020) conducted a survey to understand what library resources health sciences students need and use, and how they describe these resources, and then conducted a usability test to examine the resulting proposed changes to the library website organization.
In addition to investigating library website usability, library researchers have also explored the usability of external systems integrated within the library website, such as discovery layers (Brett, Lierman, & Turner 2016; Thorngate & Holden 2017). Institutions subscribing to LibGuides or adopting discovery layer platforms have a limited locus of control for making changes to the systems themselves; they may be able to make some level of decision, but overhauling the interface entirely is outside their control. However, usability studies inform the changes that institutions are able to make, as well as local best practices for educational interventions and internal policies. For example, Conrad & Stevens' (2019) LibGuides usability testing led to the "development of several data-informed recommendations" for guide creators in their library, such as using the same header as their library's website and avoiding subtabs in the navigation (p. 73). Tonyan & Piper (2019) conducted a usability study on Summon, their library's newly implemented discovery layer, in order to develop best practices for incorporating this tool into their information literacy instruction.

Institutional and data repository usability
As universities have sought to capture the research output of their faculty, there has been an increase in institutional repositories and data repositories supported by academic libraries. A search of the literature reveals an emerging trend of investigating the usability of these repositories. Kim and Kim (2008) describe the development of a framework for usability testing for their institutional repository and the subsequent usability testing that focused on satisfaction, supportiveness, usefulness, and effectiveness. They recruited participants who were both experienced and inexperienced with the system for comparison. After the testing and analysis were completed, they used the data to make recommendations for changes to their interfaces, including submission and search interfaces. Subiyakto et al. (2021) conducted think-aloud usability testing on the end-user interface of their institutional repository to measure efficiency and effectiveness of finding data on the site. After completing the usability testing, participants answered a "system usability scale (SUS) Questionnaire" so the researchers could measure satisfaction with the repository (p. 3). To capture efficiency of the site, they measured the time it took the participants to complete each of the ten tasks developed for the think-aloud test, and to measure effectiveness they rated how easily participants could complete the tasks. They found that the overall efficiency of the site was higher than the minimum expectation, while effectiveness and satisfaction both scored slightly below the minimum expectation. Most of the recommendations resulting from testing involved simplifying and streamlining the interface.
In 2013, Gibbs, Lin, & Quigley submitted a final report regarding usability testing conducted on the Harvard Dataverse open-source data repository. Usability testing sessions and interviews were analyzed, and since the majority of the participants had little exposure to Dataverse, the report focused on new users. Two task scenarios were developed: one for users finding data and one for users depositing data (including creating an account). Both task scenarios resulted in recommendations for updating Dataverse based on the usability data gathered, and the report's authors devised a two-tier approach for implementation.
Since this final report, there have been very few published studies regarding the usability of Dataverse. Quigley (2015) detailed in a poster an iterative testing process for Dataverse which resulted in changes to the taxonomy and faceted navigation, as well as the creation of several ways to access editing datasets.
Kamaludin (2020) used a survey to conduct a usability heuristic evaluation of their university's installation of Dataverse. These 10 usability heuristics were originally developed by Nielsen and Molich (1990) and later refined by Nielsen (1994). Kamaludin (2020) discovered that nine out of the 10 variables measured were favorably met, with the highest being 'flexibility and efficiency of use' and 'help and documentation.' The remaining variable, 'recovery and system,' was rated neutral.
This study contributes to the scant existing literature on the usability of Dataverse repositories by examining one in the context of a particular university. The results are intended to inform decisions about how best to support researchers using the TDR, to be shared with the broader TDR community, and to serve as a model for others investigating their local installations of a Harvard Dataverse repository.

Participants
After receiving institutional review board approval, the authors used convenience sampling to recruit seven researchers who had never used the TDR before. Prior work (Nielsen & Landauer 1993) has shown that usability studies do not require a large number of participants. Once the sample reaches a certain size, the information obtained from the study reaches a saturation level. Juliet Corbin and Anselm Strauss define theoretical saturation as the point where further "data gathering and analysis [would] add little new to the conceptualization, though variations can always be discovered" (Corbin & Strauss 2008, 263).

Instrument and procedure
The components of the usability study centered around the TDR's efficiency, effectiveness, and satisfaction to understand the first-time data depositors' experience with the TDR. The usability study was performed through three steps. Firstly, seven participants completed the pre-surveys (Appendix 1: Pre-survey).
We used Qualtrics to design and collect the pre-survey data. According to Hom (1998), the pre-surveys can help collect the participants' demographic information and general knowledge/experience about the TDR.
Following current practices, the study was conducted through two primary methods of measurement: a task-oriented usability test observation with concurrent think-aloud protocols, and an exit system usability scale questionnaire (adapted from SUS and Jeng ( moderator who is a TDR specialist through video conferencing software (Appendix 3: 5 task scenario for data depositors). Another usability specialist took detailed notes and recorded the sessions.
For the third task, participants were asked ahead of time to bring their own data file. Since participants were asked to complete metadata fields describing the dataset, it was necessary that they be familiar with the data and not use a 'dummy file.' Even if participants did not bring a file, they would be able to input metadata based on their knowledge of their real research dataset. We wanted to replicate an authentic data depositing experience as much as possible within the usability test.
These 5 tasks are a typical process for first-time users to deposit data in the TDR. Participants were required to verbalize their thoughts, feelings, opinions, and decisions while working through five tasks in the TDR usability test. According to Hammill (2003), a think-aloud protocol is widely used in usability testing. The think-aloud protocol helps to contextualize users' misconceptions, expectations, motivations, satisfaction, and frustrations with the system being tested. The participants' activities were observed to measure the efficiency and effectiveness of the system, and their thinking out loud responses were noted to assess the satisfaction of the user experience.
In the last data collection step, participants were given an exit questionnaire asking about their experience with the TDR. The exit system usability scale questionnaire collected information about users' experiences and reflections after using the TDR; such questionnaires are generally used as a measure of satisfaction (Kous, Pušnik, Heričko, & Polančič 2020).

Data analysis
This is a mixed-method research study. Qualitative and quantitative data were collected through observation notes and surveys. Both quantitative analysis (i.e., descriptive statistics such as the task completion time) and qualitative analysis (e.g., content analysis of the think-aloud protocols) were employed to examine the TDR's usability for first-time data depositors at Texas A&M University. We used conventional content analysis to identify coding categories from the think-aloud protocols (Hsieh & Shannon 2005). To increase credibility, every coding task was performed by two coders. By taking this mixed-methods approach, the authors used the qualitative memorandums to provide insight into the quantitative results (Creswell 2003, 208).

Participants' demographics
Seven researchers from different departments across the university participated in the study. The participants came from diverse units, including the College of Education, College of Liberal Arts, College of Engineering, and the University Libraries, and included graduate students and faculty members. Only one of the seven participants had used a data repository before. Although five had heard of the TDR, none had ever deposited data in it. Detailed demographic information is included in the following table (See Table 1: Participants' information).
A Usability Study of the Texas Data Repository JeSLIB 2022; 11(1): e1233 https://doi.org/10.7191/jeslib.2022.1233

Efficiency-Task completion time
The authors attempted to measure the efficiency of the TDR through the completion time at each task (Kous, Pušnik, Heričko, & Polančič 2020). Generally, participants finished the first task easily. In the first task, participants were required to log into the TDR through the institutional login and create their account. The average completion time for the first task was 2.42 minutes, with a median of 2.00 minutes. More than half of the participants finished the first task within two minutes.
In the second task, the participants were required to set up their dataverse collection for their research project. The average completion time for the second task was 6.14 minutes, with a median of 6.00, which means that at least half of the participants took 6.00 minutes or longer to complete the second task. One of the participants took 9.00 minutes to complete the second task.
The third task, composed of two related subtasks, asked participants to 1) create a dataset in their dataverse collection and 2) upload a data file for which the dataset was created. The average completion time for this task was 5.85 minutes with a median time of 6.00 minutes. At least half of the participants needed six minutes or more to complete this task, and one was unable to complete it. While the average and median times for task three were comparable to those for task two, one participant required 10.00 minutes to complete this task, and another took 9.00 minutes. Time to complete this task does not indicate a lack of efficiency for the participants.
The fourth task asked participants to return to the dataset that they created in task three and complete additional metadata fields. The authors did not provide direction for which fields to complete but allowed participants to choose from all of the optional fields available. The average time to complete this task was 7.52 minutes, the longest average time for all five tasks. The median completion time was 6.00 minutes. One participant took 14.00 minutes to complete the task, while another took only 4.00. The range of completion times was due to each participant's autonomy to choose the amount of metadata they wished to include. Additionally, some participants chose to give up on this task because of the number of options available, making their time to complete much shorter than expected.
The final task asked participants to return to the data file they uploaded in task three and add additional, file-level metadata. The average time to navigate for this task was 3.14 minutes and the median time was 2.00 minutes, due to the fact that some participants failed to complete this task after several attempts.
think-aloud indicate how accurately in which task the users can achieve specific goals in particular environments and whether they complete the task successfully (ISO 9241-11, 2018; Kous, Pušnik, Heričko, & Polančič 2020).

Effectiveness-Task completion observation
To measure effectiveness of the TDR, task completion was analyzed using a scale modeled on those used in other studies (Gibbs, Lin, Quigley, & Tang 2013; Kous, Pušnik, Heričko, & Polančič 2020). The four-part scale included: completing the task independently; completing the task with little help from the facilitator; completing the task with significant help from the facilitator; and failing to complete the task. If the participant was able to complete the task without any assistance or re-direction from the facilitator, the task was coded a 4. If the participant, after attempting to complete the task, needed a small redirection or assistance from the facilitator, the task was coded a 3. If the participant needed multiple redirections to complete the task, the task was coded a 2. Finally, if the participant failed to complete the task or gave up on trying to complete it, the task was coded a 1. Two of the authors coded the recordings of the participants completing each task and then discussed any disagreements in coding and arrived at the final code for each task for each participant.
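The coding scale above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' actual coding procedure: the names `CompletionCode`, `code_task`, and `disagreements` are hypothetical, and treating exactly one redirection as "little help" and two or more as "significant help" is an assumption made only to make the rule concrete.

```python
from enum import IntEnum


class CompletionCode(IntEnum):
    """Four-part effectiveness scale (names are illustrative)."""
    FAILED = 1            # failed to complete the task or gave up
    SIGNIFICANT_HELP = 2  # needed multiple redirections
    LITTLE_HELP = 3       # needed one small redirection
    INDEPENDENT = 4       # no assistance from the facilitator


def code_task(completed: bool, redirections: int) -> CompletionCode:
    """Assign a code from the outcome and the number of facilitator redirections.

    Assumption: one redirection counts as little help, two or more as
    significant help.
    """
    if redirections < 0:
        raise ValueError("redirections must be non-negative")
    if not completed:
        return CompletionCode.FAILED
    if redirections == 0:
        return CompletionCode.INDEPENDENT
    if redirections == 1:
        return CompletionCode.LITTLE_HELP
    return CompletionCode.SIGNIFICANT_HELP


def disagreements(coder_a: dict, coder_b: dict) -> list:
    """Return the (participant, task) keys on which two coders differ."""
    return sorted(key for key in coder_a if coder_a[key] != coder_b.get(key))


# Hypothetical codes from two coders for one participant.
coder_a = {("P1", 2): code_task(True, 1), ("P1", 5): code_task(False, 2)}
coder_b = {("P1", 2): code_task(True, 2), ("P1", 5): code_task(False, 2)}
print(disagreements(coder_a, coder_b))
```

Listing the disagreements first mirrors the step in which the two coders discussed differing codes before settling on a final code for each task and participant.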
Given the complexity of some of the tasks and the number of fields to fill out, it was normal for the participant to need time to consider the layout and content. Only when participants began to navigate away from the task in question, or verbalized frustration or extreme confusion, would the facilitator intervene. This was considered minimal assistance. If, after the initial assistance, the participant still seemed lost or confused, the facilitator would offer more assistance to get them back on track. This was considered significant help. At this point some participants opted to continue attempting the task and others elected to move on to the next task. If they gave up or did not finish, the task was considered a failed task (see Figure 2).
Task 1: Logging into the TDR through the institutional login. In this task, the authors wanted to observe if participants could navigate to the main page of the TDR to create their own account. All participants were successful in finding where to log in and were used to the institutional login. One user did not realize that they needed to click the "Create Account" button to finish logging in.
Task 2: Setting up your dataverse for your research project. In this task, the authors wanted to observe if each participant could create a personal dataverse which would serve as a container for new datasets. In general, most users struggled to find where to begin to set up a dataverse collection. As part of the facilitator script, the difference between a dataverse and a dataset was explained; however, four out of the seven participants were unable to complete the task without some level of assistance. When given the instruction to create a dataverse collection, users attempted to find a button labeled "Dataverse." However, users must click "Add Data" to reveal the drop-down option of "New Dataverse." Once on the page to create a dataverse, some participants found that the fields on the page were confusing, such as the "Identifier" and "Category" fields. The participants generally were not familiar with the term "metadata." Most of the users used the help feature (question mark icons) to understand the terms they were not familiar with (Figure 3).
Task 3: Adding/uploading a dataset to your dataverse. In this task, the authors wanted to see whether participants were able to upload a single data file to their newly created dataset. Only one participant required assistance to complete this task. Participants were familiar with the "Add Data" button due to task two and were able to quickly click the "Add Data" button to select the "New Dataset" from the dropdown options.
Once the dataset is initially created, participants are required to complete citation metadata fields describing the dataset. There are eight required metadata fields and five optional fields. Some participants noted confusion about the purpose and terminology of some fields. For example, three fields are for unique dates associated with the dataset (e.g., creation date vs. deposit date). Some participants had to use the help feature to better understand the purpose of the field. Moreover, all three fields are fixed to include a specific date format, which was difficult for some participants to input correctly.
The "Identifier Scheme" and "Identifier" fields prompt users to include a unique author identification number from the scheme of their selection (e.g., ORCID). This field is not required, and many participants left it blank because their identifier information was not easily accessible. Most participants also needed to use the help feature to understand the purpose of this field, as the meaning of "Identifier" was unclear.
A similar bottleneck occurred when participants entered optional keywords associated with the dataset. Once they entered a keyword, they were prompted to enter the name of the controlled vocabulary standard for that keyword. The participants were unfamiliar with the term "Controlled Vocabulary." The help feature for this field provides the acronyms "LCSH" or "MESH" as examples of vocabularies, but participants were unfamiliar with these examples. All participants left this field blank.
When prompted to enter a subject for the dataset, participants were presented with thirteen disciplinary subjects (e.g., engineering, law, social sciences) and the option to choose "Other." Some participants felt that their research didn't quite fit into these categories, but when they chose "Other" they were not given the opportunity to enter an alternative subject.
When participants were asked to upload a data file to their dataset, those who had a file completed the task with little issue. Participants noted that the drag and drop feature was particularly helpful and depositing the data was very simple. Two participants did not have a data file available but were still able to add information about their dataset in this task.
Task 4: Add/edit your metadata. In this task, the authors wanted to see if the participants were able to return to the dataset metadata to complete additional fields that were not required in task three. Participants quickly identified the "Metadata" tab for their dataset and selected the "Add + Edit Metadata" button to complete the task. Five participants completed this step independently, while two failed to complete the task. Once on the metadata page, participants could view the metadata they filled in when initially creating their dataset. For the citation metadata previously entered, the same eight required fields are present. However, once a dataset is created and saved, the interface adds an additional 21 optional fields (versus the original five optional fields) for a total of 26 optional fields. Additionally, when creating the dataverse, participants were given the option of including more subject-specific metadata fields (e.g., geospatial or humanities). If they chose any of these subject-specific metadata fields or kept the TDR defaults (all available subject-specific metadata fields), their metadata field options more than doubled. Some participants scrolled up and down the page multiple times, opening and closing accordions with the different metadata options, verbalizing their confusion and dismay at the number of fields. When adding or editing metadata, some participants entered "n/a" in fields for which they had no content, rather than leaving them blank. All participants made use of the help feature at some point in this task to understand the purpose of several fields.
Task 5: Editing your file description. In this task, the authors wanted to see if the participants could describe the contents of their data file using the "File Description" field. Only one participant successfully completed this task; the other six failed to complete it. Despite being on the "Files" tab, most participants had difficulty locating the "Edit File" button and instead went to the "Metadata" tab for the dataset and selected "Add + Edit Metadata." This resulted in some updating the dataset description rather than the file description. Most participants failed to complete the last task due to difficulty locating the field and/or general fatigue; two participants failed to complete this task because they had no file associated with their dataset.

Reports from the exit questionnaires
An exit questionnaire was employed to examine usability issues related to satisfaction with the repository (Subiyakto et al. 2021). The questionnaire consisted of 13 Likert items on a five-point scale, with five points given to the most positive response, to make the qualitative data quantifiable. Mean, median, and standard deviation were calculated for each individual question. Results are summarized in Table 2.
Responses to the exit questionnaire suggest that these first-time users' overall experience with the TDR was generally positive. For instance, the mean for the first question, "Your overall reaction to the TDR," is 3.71, and more than half of the participants chose "somewhat satisfied" or "very satisfied" (median is 4). For the second question, "I will use the TDR again," the mean is 3.86. Two participants chose "somewhat disagree" or "neither agree nor disagree" (one each), while the other five indicated that they would use the TDR again by "strongly agreeing" (n=2) or "somewhat agreeing" (n=3). However, the TDR also has room to improve in some respects. For instance, the mean for the question "The TDR is visually appealing" is 3.14 and the median is 3: more than half of the participants chose "probably not" (n=2) or "neutral" (n=3), and only two chose "definitely yes" or "probably yes" (one each).
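The reported means and medians for two of these questions can be recomputed from the response counts given above. The short Python sketch below does so, assuming that the five response options map to scores 5 (most positive) through 1; the source states only that five points go to the most positive response, so the scores assigned to intermediate options are an assumption, and the helper name `expand` is hypothetical.

```python
from statistics import mean, median


def expand(counts: dict) -> list:
    """Turn {score: number_of_participants} into a flat list of scores."""
    return [score for score, n in sorted(counts.items()) for _ in range(n)]


# "I will use the TDR again": strongly agree (n=2), somewhat agree (n=3),
# neither agree nor disagree (n=1), somewhat disagree (n=1).
use_again = expand({5: 2, 4: 3, 3: 1, 2: 1})

# "The TDR is visually appealing": definitely yes (n=1), probably yes (n=1),
# neutral (n=3), probably not (n=2).
visually_appealing = expand({5: 1, 4: 1, 3: 3, 2: 2})

for label, scores in [("use again", use_again),
                      ("visually appealing", visually_appealing)]:
    print(f"{label}: n={len(scores)} "
          f"mean={mean(scores):.2f} median={median(scores)}")
```

Under this mapping the script gives a mean of 3.86 and a median of 4 for the first item, and a mean of 3.14 and a median of 3 for the second, which agrees with the values reported in the text.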

Discussion
The focus of this study was the experience of first-time users of the TDR, which provides a specific lens through which to view usability of a platform. Because this is a new service, we anticipate that most of the TDR users are first-time users. Additionally, this study examines the user as a depositor, positioning the participants as content creators rather than content consumers. These are key facts in examining the user experience for the TDR. An experienced user may be able to offer more nuanced observations, and a user searching the repository to find data would provide a distinctive viewpoint from which the user experience can be studied.
In the exit survey data, users were positive about the overall look of the repository and indicated that they would use it again. Looking at the observations, the efficiency data suggest that users can move quickly through the TDR to complete tasks. The effectiveness data also suggest that users can often complete the tasks necessary to deposit data without significant help. Like other studies, the authors examined these three measures, which suggest positive usability at face value (Kous, Pušnik, Heričko, & Polančič 2020). However, direct observations through content analysis in the study revealed how users are hindered in their efficiency and effectiveness by a lack of understanding of the repository structure or are thrown by terminology used to guide content creation.
Sometimes the participants' satisfaction might not necessarily reflect their experience comprehensively and accurately. For example, in the think-aloud process, the majority of the participants had difficulty understanding the difference between a dataverse and a dataset and usually found adding/editing metadata overwhelming, but this was not reflected in their answers to the exit questionnaire. The question "The TDR is easy to use" still obtained a mean of 3.29, with a median of 4. This contrast is precisely why usability testing is important. If the user's subjective satisfaction or opinion were the only measure, the actual usability issues encountered with the system would be missed (Nielsen & Levy 1994).

Core concepts hinder efficiency and effectiveness
Thinking about one's own data to complete the various metadata fields in the TDR is difficult for some participants. This activity required participants to pivot their mindset from a data creator to that of a data curator. They are no longer generating factual information to validate research findings. Instead, they must describe the data as a file, or an item that will be shared with others.
Understanding what others might need to know to make the data reusable was not apparent to the first-time users in this platform, which created confusion throughout the user experience.
The observations from multiple steps suggest that the purpose of robust metadata was not understood by everyone. Indeed, even the term "metadata" was unfamiliar, as some noted when they thought aloud. The terminology used to guide the depositor through metadata creation accommodates a curator's understanding of a data repository but doesn't necessarily translate to a researcher's understanding of their data. For example, the term "Identifier" confused many participants. Additionally, the help feature was useful for participants when additional context or guidance was needed, but sometimes it led to more confusion due to the terminology and examples used. For instance, many participants were confused by the example for controlled vocabulary (see Figure 4).
Another hindrance was confusion around the TDR's framework for metadata. As a Harvard Dataverse platform, the TDR allows all users to create their own unique dataverse collection, which contains four levels of metadata: dataverse metadata, citation metadata, domain-specific metadata, and file-level metadata. This tiered framework was explained to participants at the start of the study, but many still had difficulty navigating the four levels. For example, when asked to edit the data file's metadata, one participant selected the "Metadata" tab and changed the text of the dataset's description. The participant mistakenly performed the task at the wrong level. This is the strongest example of an overarching usability issue. The platform's tiered model for metadata description is core to its function, but many of the participants did not have an accurate mental model of the platform, which left them scrolling up and down the page or jumping back and forth between different tabs and pages to perform a single task.
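One way to picture the four metadata levels is as nested containers. The Python sketch below is only a mental-model aid under stated assumptions: the class and field names (`DataverseCollection`, `Dataset`, `DataFile`, and their attributes) are hypothetical, the example values are invented for illustration, and none of this represents the Dataverse software's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    name: str
    # Level 4: file-level metadata (e.g., the file description)
    file_metadata: dict = field(default_factory=dict)


@dataclass
class Dataset:
    # Level 2: citation metadata (assumed here to hold the dataset description)
    citation_metadata: dict = field(default_factory=dict)
    # Level 3: domain-specific metadata (e.g., geospatial fields)
    domain_metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)


@dataclass
class DataverseCollection:
    # Level 1: metadata describing the dataverse collection itself
    dataverse_metadata: dict = field(default_factory=dict)
    datasets: list = field(default_factory=list)


# Illustrative values only.
collection = DataverseCollection(dataverse_metadata={"name": "Example Lab"})
dataset = Dataset(citation_metadata={"title": "Example dataset",
                                     "description": "Survey responses"})
data_file = DataFile(name="responses.csv")
dataset.files.append(data_file)
collection.datasets.append(dataset)

# The file-level task targets the file's own metadata...
data_file.file_metadata["description"] = "One row per respondent"
# ...not dataset.citation_metadata["description"], which sits one level up.
print(dataset.citation_metadata["description"])
print(data_file.file_metadata["description"])
```

Seen this way, the wrong-level edit described above amounts to writing to the dataset's description when the task concerned the description of a file contained in that dataset.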

Service interventions
Many of the user experience issues identified in this study can be mitigated by technology solutions, but some may also be overcome through service interventions. As in studies of other third-party platforms, such as discovery systems and LibGuides, the authors do not have complete autonomy to significantly redesign the platform. For example, changing the language used in the guidance for each metadata field should be done at the consortial level of the Texas Data Repository upon agreement of all members of the consortium. Within the authors' direct locus of control, however, are the guidance provided to users before they engage with the TDR and the curation support offered once data is deposited. The high usage of the TDR at this university suggests that users are interested in depositing, and, based on Q3 and Q5 of the exit questionnaire, these depositors are likely to value support such as the interventions we suggest below.
Pre-deposit intervention. To reduce the cognitive load on first-time users, who must assume the role of a curator in order to navigate an unfamiliar framework, librarians can be clearer about key terminology used in the repository and offer instruction about these terms. Removing jargon is not a panacea and does not by itself address a learning need. In order for users to shift their perspective to that of a curator, they must learn core concepts like metadata and understand how the TDR is structured to support curation actions at four different levels of metadata. Instruction can support users as they navigate from their current understanding to a new level of cognition, which can build confidence and eventual mastery (Bates 2019).
Librarians can create effective tutorials and checklists that teach users core concepts and prepare them for the deposit. Tutorials can take the form of a short video or a combination of written and graphical instructions. The tutorial should be placed prominently at the beginning of the deposit process on the Libraries' website. In addition to teaching how to navigate the interface, this tutorial should share the core concepts described above. The authors also observed that some participants left fields blank because they were not sufficiently alerted to the information they should have at hand to enter as metadata, such as their ORCID iD or the citation to a related publication. Alongside the tutorial, a checklist of the information users need to complete different metadata fields would help prepare them to complete these tasks at the time of need.
Post-deposit intervention. Even the most effective user instruction has limitations. Researchers may not have the time or motivation to shift their viewpoint to that of expert curators, and additional curation support may be needed (Data Curation Network 2019). To meet this need, the Libraries can develop an additional service intervention to augment metadata and documentation post-deposit. A curation service in which librarians review the deposit and provide users with suggestions for augmenting metadata and documentation could improve discovery and reuse of their data. Model curation services are offered by several academic libraries, but scaling a similar service at Texas A&M University poses some sustainability and staffing concerns (Hudson-Vitale et al. 2017). Nevertheless, the low effectiveness of metadata completion in tasks 3-5 suggests that there is a need for curation services, so one possible next step is to pilot a service to test its scope and sustainability.

Future directions
While this study provided several initial impressions of potential changes to the platform itself, the authors understand that further usability studies are needed before recommendations can be made to the consortial TDR partners or the Harvard Dataverse community. For example, this study focuses entirely on first-time users, but further investigation is needed with experienced users. Recommendations for stakeholders come from a place of discovery, and there is more to discover about users of the repository before advocating for change. Additionally, further inquiry is needed on how librarians can support researchers as they transition from the role of data creator to that of data curator. As more demands are placed on researchers to curate data, this transition in identity will require guided practice. Future studies could also test the usability of the tutorials: whether users seeking support can find them on the library page, and whether the tutorials, once at hand, help users complete tasks.

Conclusion
Usability studies of data repository self-deposit models have the potential to reveal both positive and negative insights into the user experience. The study and research questions constructed to evaluate the efficiency, effectiveness, and satisfaction of the platform can provide a clear roadmap for study design and tangible evidence of user satisfaction and potential areas for improvement. This study outlines a triangulation of methods to answer the research questions, including pre- and post-surveys and direct observation through a usability study with a think-aloud protocol. These methods offered unique perspectives on the user experience and combined to suggest two core themes: 1) users are often unattuned to the purpose and importance of metadata and require a more conceptual understanding of its use in making data findable and reusable, and 2) first-time users of a Harvard Dataverse platform do not have a clear understanding of the platform's framework, or a clear model for the outcome of the data deposit. Data curators and repository managers can benefit from the insights of this study by augmenting the guidance they provide for researchers and developing curation services in order to improve the user experience. Those who have expertise in metadata and curation are uniquely positioned to collaborate with researchers in the self-deposit of data, helping to ensure that reuse is possible because the data are aligned with best practices for open data such as the FAIR data principles (Wilkinson et al. 2016). Additionally, conducting a user study such as this usability study can inform the framework of a data curation service by providing user data, regardless of the platform or deposit model used at an institution.
The methods used in this study can serve as a model for other libraries investigating the usability of their data repositories in order to improve the user experience of depositing data. The results can then be used to identify similar training or service interventions or, where possible, to develop system customizations.