Developing a Data Management Consultation Service for Faculty Researchers: A Case Study from a Large Midwestern Public University

To inform the development of data management services, a library research team at Kent State University surveyed all tenured, tenure-track, and non-tenure track faculty about their data management practices and perceptions. This article presents the methodology and results, and describes how the findings informed the subsequent work of the library’s internal working group. It also offers recommendations that other academic libraries could use as a model for developing similar services at their institutions. Personal anecdotes are included that help ascertain current practices and sentiments around research data from the perspective of the researcher. The article addresses the particular needs of a large Midwestern U.S. academic campus, which are not currently reflected in literature on the topic.

Received 16 November 2018 ~ Accepted 17 January 2019
Correspondence should be addressed to Virginia A Dressler, 1125 Risman Dr, Kent, Ohio 44242. Email: vdressle@kent.edu
The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by the University of Edinburgh on behalf of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/
Copyright rests with the authors. This work is released under a Creative Commons Attribution Licence, version 4.0. For details please see https://creativecommons.org/licenses/by/4.0/
International Journal of Digital Curation 2019, Vol. 14, Iss. 1, 1–23
DOI: 10.2218/ijdc.v14i1.590 (http://dx.doi.org/10.2218/ijdc.v14i1.590)


Introduction
Researchers are increasingly required to develop plans for long-term data management and sharing, in the U.S. often by mandate of grant directives from funders such as NEH, NSF and others. Yet data management and related activities require skill sets that many researchers do not already have. Libraries have track records of providing support and outreach in a number of their traditional service points (such as reference and instruction) (Kong, Fosmire and Dewayne Branch, 2017). Certain librarian positions within digital library and institutional repository initiatives, for example, require skill sets in digital media management and overarching digital preservation knowledge. We believe that libraries are well positioned to lead on this topic and to provide relevant consultation and services to support these endeavors.
In early 2017, Kent State University Libraries formally addressed expanding consultation services to include data management through a new internal working group. The working group included members of reference, instruction, institutional repository, technical services, and digital projects. A team of three librarians from the working group conducted a survey during the Fall of 2017 to investigate research data management issues and practices at the institution.
The initial goal was to learn about data management services at other institutions and to understand current data management practices among Kent State University faculty. The team also wanted to know what services or programs would be useful to faculty researchers managing or sharing research data. Finally, we wanted to identify other departments outside of the library for collaboration on the provision of data management services. This article will highlight the top takeaways from the survey that were used to define and implement new services.

Literature Review
A literature review was conducted to examine research articles addressing the implementation of data management services within the library. It also covers several higher-quality resources and toolkits that assist librarians working on research data topics. The search strategy focused on recent scholarly publications, primarily from the last ten years, on data management and academic libraries.

Data Management Services within the Academic Library
The literature emphasizes the importance of research data management and mentions several obstacles to data management and sharing. Patel (2016) states the important role of data in research projects and the benefits of sharing research data along with some challenges in research data management, including copyright, data licensing, erroneous interpretation of data, security, privacy, and a mind-set that prevents some researchers from sharing their data. Whitmire, Boock, and Sutton (2015) highlight the need for standardized metadata as an additional challenge and suggest the library can develop services and training to assist researchers with data management. Funari, in a 2014 article about research data with a focus on a European context and research data in the humanities, summarizes categories of obstacles to research data access and re-use, including legal, financial, and technical. Funari advises that "research data" may be defined differently by different organizations and may differ greatly in quantity and typology, "and also for the degree of necessity and practice in their sharing and reusing" (Funari, 2014) between natural sciences and humanities. Faniel and Silipigni Connaway (2018) performed a qualitative research study to examine the experience of academic librarians with research data management programs to support researchers. Qualitative data was collected through hour-long interviews with thirty-six library professionals. The authors highlighted five factors of influence: technical resources; human resources; researchers' perceptions about the library; leadership support; and communication, coordination and collaboration. These factors were ultimately found to act either as facilitators for academic librarians supporting these initiatives or as constraints.
Further, they call for more subject and technical expertise to address the complex needs of research data. An interesting element present throughout the interviews was that some services were still in planning or in very early stages of implementation, which indicates that many of these services are still in their infancy. Core services of these programs centered on writing data management plans, depositing data and/or managing data. Library staff preferred that this work take place at an early point in the research cycle, which would entail communicating services and collaborating closely with other entities on campus that work regularly with researchers at different points in a project (Office of Research, Information Services/Information Technology, and others).
Chen and Zhang (2017) examined job descriptions and required and preferred qualifications for library job announcements that included the word "data" in the job title to determine what knowledge and skills successful candidates for these positions should have. While the study's aim was to inform the curriculum of LIS programs, their findings indicate the types of data services that libraries offer, or expect to offer, to meet the needs of their institution's researchers. Most of the positions examined required the candidate to be able to assist faculty and students with data collection, data management, and data analysis. Some postings mentioned specific software or tools, but these varied. Frank and Pharo (2016) used a modified Delphi method with two rounds in their study of meteorology students and associated stakeholders at the University of Oslo to assess perceptions of data information literacy and attitudes about its instruction for meteorology graduate students. They formed a panel of experts composed of meteorology professors at the University of Oslo's Department of Geosciences' Meteorology Section, researchers from the Meteorological Institute (MET), PhD students from the university's Department of Geosciences' Meteorology Section, and academic librarians from the University of Oslo's Science Library. All panelists agreed that data information literacy skills are important for graduate students in meteorology. There was less consensus about the role of librarians (even among the three librarians on the panel) as stakeholders regarding data information literacy of meteorology graduate students, but all three librarians identified future roles for librarians. Several obstacles to library involvement in data information literacy training were identified.
In November of 2014, a survey was sent to Teaching and Research (T&R) faculty and Research faculty at Virginia Polytechnic Institute and State University to learn about faculty researchers' existing practices to organize, describe, and preserve data, and their needs for services and education (Shen, 2016). A "lack of systematic planning and preservation activities" along with limited storage options and "sporadic and informal documentation practices" were noted. The study also identified the need for technical support, application of metadata standards, and education. One interesting finding that Shen notes is that some faculty researchers seemed to be under the impression that addressing issues mentioned in IRB policies, such as confidentiality and sensitivity of data, is akin to data management planning. Goben and Nelson (2018) outline a new initiative from the Association of College and Research Libraries (ACRL), a full day workshop, "ACRL RoadShow, Building Your Research Data Management Toolkit: Integrating RDM into Your Liaison Work". The module was developed using backwards design, in which the desired results serve as the starting point of the course. As Goben and Nelson point out, "As academics across disciplines face increasing need for data management skills, librarians have an opportunity to apply their expertise in this additional realm." This article highlights the importance of integrating data management skill sets into the work of liaisons, who often come across many opportunities for outreach and education through regular job duties. Whitmire, Boock, and Sutton (2015) presented a case study of using survey results to inform the development of data management services at the Oregon State University Library.
Their survey asked faculty at that institution about the type and volume of data generated; who performs the tasks associated with research data management in their research teams; and their current practices of metadata creation. A major finding from their survey was the differences in where faculty stored data; most notably, that over half of the faculty in the colleges of Engineering, Science, and Veterinary Medicine reported storing data on servers that they themselves maintained.

Resources and Toolkits
At the University of California Berkeley, a training approach was built to address current knowledge gaps around data management skill sets for all subject liaisons (Wittenberg, Sackmann and Jaffe, 2018). The authors reported that this approach achieved a higher success rate than previous unit-wide efforts to train librarians on the topic. Identified services were shared in the article, as well as reflections on the success of these initiatives: "... the success of their efforts is equally dependent on the process by which they develop these new capabilities" (Wittenberg, Sackmann and Jaffe, 2018). The Librarian Training Program at Berkeley was deemed to be very successful in bringing more awareness and knowledge of research data management to all subject liaisons, regardless of their speciality or expertise. Work will continue to address subject-specific needs, but it is admirable that an all-encompassing training initiative was instituted to educate many of the staff members who work most directly with researchers.
The Data Curation Profile Toolkit from Purdue (Carlson, 2010) is a useful place to begin for an institution seeking guidelines and recommendations in the identification and assessment processes of how researchers are currently managing or curating data. The toolkit includes four components: the User Guide, the Interviewer's Manual, the Interview Worksheet, and the Data Curation Profile Template. The manual, worksheet, and template provide some useful frameworks for practitioners to more consistently and methodically capture information around research needs.
In June 2018, the non-profit Joint Information Systems Committee (JISC) released a research data management toolkit, which features options for three different user types to interact with the included resources (researcher, research support, IT specialist). The toolkit compiles information about research data management, policy planning, infrastructure, associated costs, storage/backup, and more categories. There are also related resources for courses, videos, and other guides to assist in many aspects of research data management.

Materials and Methods
Kent State is a public university, with eight campuses throughout northeast Ohio and a total student enrollment of over 36,000 students (including both undergraduate and graduate). Kent State University holds an R2 Carnegie Classification and employs approximately 2,700 academic faculty. Kent State University Libraries currently offers consultation services for researchers on the topics of literature reviews, copyright, affordable course materials, and data analysis software.
The questions the team sought to answer were:
RQ1: What questions should libraries ask of faculty when developing data management services? What sort of data management services would be the most appropriate for our faculty?
RQ2: What are the knowledge gaps among faculty about data management that the library could help fill?
RQ3: How can the library identify collaborators on campus to participate in the development of data management services?
RQ4: Are there any differences in data management practices and attitudes between disciplines that libraries should take into account when developing data management services?
To investigate these research questions, a Qualtrics survey was distributed via email to all tenured, tenure-track, non-tenure track, and adjunct faculty at all Kent State University campuses (n=2749). The survey was distributed on October 19, 2017, and closed on November 20, 2017, with two reminder emails sent in the interim. In-progress responses were closed after one week. The survey instrument was adapted from Whitmire, Boock, and Sutton (2015), with minor changes incorporated to address our specific institution.
The data was analyzed using R, an open source statistical software package (R Core Team, 2018). Multiple choice questions were analyzed using frequencies and proportions. Qualitative, open-ended questions were content-analyzed by all contributing authors independently before coming to consensus about final coding. Extended response answers were manually coded by the researchers for mentions of aspects such as software, common practices, and other facets.
Of the initial distribution, 287 people started the survey, for a response rate of 10.4%. Twenty-eight respondents were filtered out after answering "Never" to the first question, which ended the survey for these respondents. In all, there were 259 responses with usable data for some, or all, of the survey questions. There were 180 who completed the survey to the end, for a completion rate of 69.5% (Figure 1).
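The reported rates can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not the authors' analysis code (the analysis itself was done in R). The variable names are illustrative, and the denominators are an assumption inferred from the reported figures: the response rate is taken over all invited faculty, and the completion rate over the usable responses.

```python
# Counts reported in the text above.
invited = 2749       # faculty who received the survey
started = 287        # respondents who started the survey
screened_out = 28    # answered "Never" to the first question
completed = 180      # reached the end of the survey

# Assumed denominators (inferred from the reported percentages).
usable = started - screened_out         # responses with usable data
response_rate = started / invited       # share of invitees who started
completion_rate = completed / usable    # share of usable responses completed

print(f"usable responses: {usable}")
print(f"response rate: {response_rate:.1%}")
print(f"completion rate: {completion_rate:.1%}")
```

Under these assumptions the sketch reproduces the 259 usable responses and the 10.4% and 69.5% rates given in the text.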

Demographics
Respondents were asked to self-report their faculty status and college affiliation. The majority of respondents were tenured faculty (see Table 1). Note that the 'not applicable' faculty status responses may represent adjuncts who do not consider themselves non-tenure-track.
While our research team does want to create inclusive service points around research data, we particularly want to be sure that tenure-track faculty are receiving the support needed to acquire tenure. Future surveys may look to isolate this group for further study to get more information from this particular base of researchers.

Survey respondents were asked what other library services they had used in the past, and could select as many as were applicable to them. The largest proportion reported assistance from the reference desk or direct consultation with a subject librarian. The subject librarian answer supports the case made in both Wittenberg, Sackmann and Jaffe (2018) and Goben and Nelson (2018) for providing additional education to subject librarians on this topic. If many researchers already consult subject librarians, addressing research data needs through these existing, regular service points would be the most practical approach.

Volumes and Types of Data Generated
In our sample, we found that 57% (163) of faculty indicated they generate data always or most of the time at our institution, and 75% (159) generate between one and five datasets annually (see Table 2). The majority of surveyed faculty indicated that they generate less than 1TB of data per year, with 40% of these indicating less than 1GB.

Table 3 provides a view of what kinds of data the researchers at our institution are most commonly dealing with through their work, with quantitative data having the highest representation at 70.9%. One point that our team found interesting is that audio was rather high on the list, accounting for 38% of reported data types from the surveyed faculty. Framing the types of data most commonly produced in research is helpful in designing services to complement these particular data types. For example, our library staff may decide to focus on audio as a place to address further education for library faculty and staff, such as identifying local transcription services or exploring automated transcription services.

Table 3 (excerpt). Non-digital images: 18 (8.9%)

Data Management Planning
The middle section of the survey contained questions related to long-term data management planning (see Table 4). We found that very few faculty were actively engaging in data management planning: only 24.6% had developed a data management plan in the last five years.
Additionally, faculty perceptions about long-term storage and access of data after a project or grant period were often idealistic (42.9% said that the lifespan of their data was "as long as possible"), which may not always be practical or sustainable.
These findings indicate a current knowledge gap at our institution, and as such, are one of the main identified areas to focus initial attention by way of the internal working group and outreach/education initiatives. Practices related to publication of research data were still very limited:
1. Only 24.7% had ever published data alongside an article.
2. Only 8.0% had ever used a copyright license with published data.

Methods of Backup and Storage
Consistent with prior studies, the most popular form of storage and backup for research data was a form of physical media such as disks, tape, hard drives, or USB drives (see Table 5). Within all colleges from which faculty answered this question, such physical media were either the top choice or tied for the top choice for data storage after project conclusion. From personal anecdotes, we surmise that this solution reflects what is either most available to the researcher without consulting outside units or is simply how data is stored during the project and remains so after the completion of the project or adjoining research paper.

Under 30% of the faculty had stored copies of their datasets in a data repository or archive. Interestingly, the proportion of faculty respondents using web-based or cloud servers was quite high at 75%, whereas the proportion of respondents saving data on a university server was about half. Personal hard disks and web-based storage permit a degree of control not present with university-owned servers; faculty may feel that they have easier access to data if it is stored on their own devices or cloud accounts.

In the open-ended responses, we found that surveyed faculty at our institution were often left to find their own storage methods for research data (both during and after research projects). This included a heavy use of external media, such as hard drives or thumb drives, or cloud storage (Dropbox and Google Drive being the most prominent). There is an immediate faculty need for a solution that allows easy data sharing, with options to restrict usage based on existing requirements or confidentiality needs. A system with embargo potential would also be of value for some researchers at Kent State University.
Web-based or cloud servers were the second highest choice for the College of the Arts, the College of Arts and Sciences, and the College of Education, Health, and Human Services. Faculty from the College of Communication and Information showed the highest rate of adoption for cloud storage to protect data after the conclusion of a project.

Barriers to Sharing
Respondents could select multiple barriers to sharing from a list (see Table 6). Overall, confidentiality requirements were overwhelmingly the biggest barrier to sharing data (64.7%). Other barriers selected by more than 25% of respondents were lack of a mechanism to share the data (28.2%), insufficient time to make data available (27.6%), lack of funding (27.1%), and the potential for data to be misinterpreted or misused by others (25.9%). When broken out by college, confidentiality was the top barrier for all colleges except for the College of the Arts; among the faculty in that college, the top barrier was lack of time.
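Because respondents could select more than one barrier, the percentages above are shares of respondents rather than shares of selections, and they can sum to more than 100%. The following Python sketch illustrates how such a "select all that apply" item might be tabulated. It is an illustration only: the function name `multiselect_proportions` and the toy responses are hypothetical, are not the survey data, and do not reproduce the authors' R analysis.

```python
from collections import Counter


def multiselect_proportions(responses):
    """Return {option: share of respondents who selected it}.

    `responses` holds one collection of selected options per respondent.
    The denominator is the number of respondents, not the number of
    selections, so the shares can sum to more than 1.
    """
    n = len(responses)
    if n == 0:
        return {}
    # set() guards against an option being counted twice for one respondent.
    counts = Counter(option for selected in responses for option in set(selected))
    return {option: count / n for option, count in counts.most_common()}


# Hypothetical toy data, NOT the survey responses.
toy_responses = [
    {"confidentiality", "time"},
    {"confidentiality"},
    {"funding", "time"},
    {"confidentiality", "funding"},
]

shares = multiselect_proportions(toy_responses)
for option, share in shares.items():
    print(f"{option}: {share:.1%}")
```

In this toy example three of four respondents chose confidentiality, and the shares add up to well over 100%, which is the same pattern seen in the barrier percentages reported above.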

RQ1: What questions should libraries ask of faculty when developing data management services? What sort of data management services would be the most appropriate for our faculty?
Our survey instrument contained questions about the type and volume of data generated; data management roles within the research team; data ownership and rights; long-term storage; and barriers to sharing. Our survey found that education was the most important service that the library could provide to faculty. Librarians can partner directly with faculty who supervise graduate assistants, and provide training on proper data management practices. They can also provide feedback to faculty who are developing data management plans.
When conducting future research that aims to examine data sharing with outside researchers and scholars after a study is complete, researchers should make a distinction between data-sharing among co-researchers and data-sharing for re-use. Our survey items addressed data management from collection to long-term storage and/or destruction, yet we perceive from write-in answers and follow-up consultations that many respondents were focussed on data practices and sharing among research team members.
Though many respondents were aware of limitations to data-sharing for re-use, those who had chosen to share data most often opted to work with subject-specific data repositories that have mechanisms in place to ensure the long-term safety, storage, and accessibility of data files.

RQ2: What are the knowledge gaps among faculty about data management that the library could help fill?
The survey indicated a substantial gap around data management planning, with 75% of surveyed faculty indicating they had either no plan in place, or had not considered the notion of data management. Most of the questions about data management practices included a "Don't know/Not sure" option, and many individuals indicated this uncertainty. Data copyright had the greatest rate of uncertainty (33.5%, Table 4), followed by size of research data generated per year (21.1%, Table 2); developing a data management plan for any research projects in the last five years (14.8%, Table 4); expected lifespan of data (5.4%, Table 4); and publishing data in conjunction with an article (4.5%, Table 4). These findings indicate a current knowledge gap at our institution, and as such are identified areas to focus attention by way of the internal working group.
Responses to the "Barriers to sharing" question voiced a need for a storage and access solution that is robust, easy to use, and able to share data easily, with options to restrict usage based on existing restrictions or confidentiality needs. Barring confidentiality requirements, time and funding were cited as the top reasons why researchers are not currently engaging in better research data management practices, and this is an area in which we perceive the library to be best situated to assist researchers.
Several faculty mentioned that they were self-trained on data management practices, and were now in the position of needing to train graduate students. One particularly notable extended response said that "... I need better solutions and better training for students on good data mangement [sic] and documentation practices - but I was never trained in any of this myself."
Through follow-up consultations with some survey respondents, we found that some disciplines, such as Biology and Library Science, had some courses and workshops on research data available (or soon to be available), but that the majority of disciplines did not have inlets for budding (or more advanced) researchers to build and develop these skill sets. Therefore, this area is perhaps the place where well-defined research data support within the library could be of the most benefit for KSU researchers. Basic and advanced short library instruction courses could help researchers develop skills around data management.
Additionally, researchers who trained and supervised graduate assistants noted that constant turnover among research assistants made it difficult to keep data-sharing and data management practices consistent and continuous. We plan to address this in future workshops and classes to assist researchers in developing better practices.
Overall, the general attitude of responses showed that faculty were more concerned with research data management during the research process than 'long-term' data management, the latter being an area where the library (particularly digital librarians with existing skill sets around digital media management) can be of assistance.

RQ3: How can the library identify collaborators on campus to participate in the development of data management services?
The process of developing and conducting the survey helped us to organically discover both institutional partners and faculty partners.
Within the survey, several faculty expressed interest in talking about the topic of data management with us further. These faculty members agreed to meet with us informally to talk about their data management needs.
By way of both the survey and other meetings over the past year, the library has found two outside units currently working with researchers on aspects of the data management process. The Office of Research assists with grant applications and monitors research compliance, and Information Services provides systems support. In the short term, a collaborative working relationship has already increased referrals between the units, and in the long term, these connections can help to develop a comprehensive set of new services.
As we discovered in the analysis of the survey responses, the most candid write-in responses reflect many areas for our team to direct resources. In particular, we must strive to better communicate existing services for researchers, and consistently seek feedback through open library sessions. We must also strive to share feedback between units, especially with the Office of Research and Information Services.

RQ4: Are there any differences in data management practices and attitudes between disciplines that libraries should take into account when developing data management services?
A question at the beginning of our survey defined 'research data' as "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings." However, there were differing notions of the general concept of data between disciplines. Because the survey sample included faculty in all colleges, some respondents outside of natural sciences and social sciences "analogized" their closest equivalent to research data. For example, a faculty member from the Arts said: "I create and photograph artwork but do not have a system for managing it at this time." We believe that defining data more clearly is a basic education point that could be incorporated into all of the new library initiatives.
In terms of differences between disciplines and research attitudes, one respondent from the humanities was critical of some of the wording of certain areas in the survey, saying that the use of "research teams" in some of the questions was biased in favor of practices in the sciences. If researchers intend to address data management practices across disciplines using a broad definition of data, then care must be taken to avoid language that could be read as applicable only to certain disciplines.
Some of the notable similarities across disciplines were preferences for the use of physical media and cloud storage rather than university Web servers or repositories; and the perception of confidentiality as the greatest barrier to sharing. We also observed the need to improve communication about existing services within the institution.

Conclusion
The survey conducted at Kent State gave the University Libraries a starting place to define current needs at the institution and has led to the creation of a cross-departmental working group to address these needs (in conjunction with the Office of Research and Information Services). Survey results indicate a clear need for basic data management services and also point to some specific education and systems needs of current researchers. Additionally, the open-ended survey questions gave participants an outlet to share previous experiences and frustrations with the lack of needed support around research data. These anecdotes proved to be most useful in follow-up conversations with outside units, where they relayed these experiences to individuals who work in areas that can best address these needs. Following subsequent conversations with the Office of Research, researchers who need consultation on research data management plans for grant applications are now being referred to a member of the research data working group at the library.
General data management education and research data management (and adjoining practices) are the top priorities identified in the survey. The information collected in the survey will assist in planning and implementing new library services. We also anticipate a follow-up survey once new initiatives are underway, both to ensure this work is well guided and to gather feedback for refining services and addressing any gaps as needed.
Q2.3 Indicate the types of data your research generates. We have included examples of common file or document types for clarification. Select all that may apply.
Quantitative, tabular or structured data (CSV, Excel, SPSS, JSON)
Geospatial data (vector and raster data, shapefiles, geodatabases)
Digital databases (surveys, census data, government statistics)
Unstructured text in a digital format
Digital images (.tif, .jpeg, .jp2)
Data about biological, organic or inorganic samples or specimens
Digital gene sequences or digital renditions of biological, organic, or inorganic samples or specimens
Audio recordings (analog or digital)
Video recordings (analog or digital)
Electronic lab notebook(s)
Non-digital text (handwritten notes, sketches, figures, paper lab notebooks)
Non-digital images (photographs, etc.)
Metadata (.xml, .rtf, .txt)
Other (Please specify) ________________________________________________

Q2.4 What would be your best estimate to the size of the research data generated per year (in total)? Note: 1 Terabyte (TB) = 1,000 GB
⭘ Less than 1 GB
⭘ 1 GB - 1 TB
⭘ More than 1 TB
⭘ Don't know/Not sure

Q2.5 Considering the research data you have generated in the last five years, who owns the data, and will the data be freely available for users to access? Select all that may apply.
I own the data, and it is freely available for public use
I own the data, but there are restrictions on use
I am part of a team that created the data, and it is freely available for public use
I am part of a team that created the data, but there are restrictions to use
Ownership is by another party (Please describe if possible) ______________________
Unknown

Q2.6 Some grants require that data must be made available for a set amount of time. When thinking about the research data you've generated in general, how long do you expect for it to be usable? That is, what is the "lifespan" of your data?
⭘ A short, set amount of time (Less than five years)
⭘ As long as possible
⭘ Unknown/Hadn't considered
⭘ Other (Please explain) ________________________________________________

Q2.7 Please indicate which of the following strategies you use to protect your data from corruption or loss once the project has concluded.

Q2.13 Have you ever published research data in conjunction with an article? That is, have you ever made the "raw" data from a research study available alongside a publication?
⭘ Yes
⭘ No
⭘ Don't know/Not sure

Q2.14 Please indicate which of the following limits the sharing of your data outside of your research team? Select all that may apply.
Confidentiality requirements (privacy, human subject data, etc.)
A lack of funding
A lack of standards (data or metadata)
An opinion that people don't need the data