Evaluating the Effectiveness of Data Management Training: DataONE's Survey Instrument

Effective management is a key component for preparing data to be retained for future long term access, use, and reuse by a broader community. Developing the skills to plan and perform data management tasks is important for individuals and institutions. Teaching data literacy skills may also help to mitigate the impact of data deluge and other effects of being overexposed to and overwhelmed by data. The process of learning how to manage data effectively for the entire research data lifecycle can be complex. There are often multiple stages involved within a lifecycle for managing data, and each stage may require specific knowledge, expertise, and resources. Additionally, although a range of organizations offers data management education and training resources, it can often be difficult to assess how effective the resources are for educating users to meet their data management requirements. In the case of Data Observation Network for Earth (DataONE), DataONE’s extensive collaboration with individuals and organizations has informed the development of multiple educational resources. Through these interactions, DataONE understands that the process of creating and maintaining educational materials that remain responsive to community needs is reliant on careful evaluations. Therefore, the impetus for a comprehensive, customizable Education EVAluation instrument (EEVA) is grounded in the need for tools to assess and improve current and future training and educational resources for research data management. In this paper, the authors outline and provide context for the background and motivations that led to creating EEVA for evaluating the effectiveness of data management educational resources. The paper details the process and results of the current version of EEVA. Finally, the paper highlights the key features, potential uses, and the next steps in order to improve future extensions and revisions of EEVA. 
Received 14 January 2017 ~ Accepted 8 December 2017
Correspondence should be addressed to Chung-Yi Hou, National Center for Atmospheric Research, P.O. Box 3000, Boulder, CO 80307-3000, U.S.A. Email: hou@ucar.edu
An earlier version of this paper was presented at the 12th International Digital Curation Conference.
The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by the University of Edinburgh on behalf of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/
Copyright rests with the authors. This work is released under a Creative Commons Attribution Licence, version 4.0. For details please see https://creativecommons.org/licenses/by/4.0/
International Journal of Digital Curation 2017, Vol. 12, Iss. 2, 47–60. DOI: 10.2218/ijdc.v12i2.508 (http://dx.doi.org/10.2218/ijdc.v12i2.508)


Introduction
With the development and improvement of digital technology, scientific advancement is increasingly driven by data. However, as data rapidly proliferate, it is also becoming more challenging for scientific researchers to understand, analyze, and synthesize Big Data, that is, data characterized by large size or volume, high complexity, and the need for a variety of processing technologies (Ward and Barker, 2013). Likewise, for those who support scientific research, such as librarians and information professionals, and for those whose work is affected by scientific output and products, such as students and educators, being able to manage and ameliorate the effect of the 'data deluge' (Borgman, 2010; Hey and Trefethen, 2003) will be crucial to realizing the full potential of scientific data. Consequently, in order to uphold and realize eScience's core tenet of enabling "data [to be] available, and easily accessible by all" (Wright et al., 2007), it is vital that everyone who works with data has, and continues to build, share, and contribute to, knowledge and skills in data management.
Understanding and learning about data management can be a complex process. The concept of 'data management' may take on different meanings depending on the specific application. For instance, within the context of scientific data, data management can often be synonymous with 'digital curation,' which is defined by the Digital Curation Centre as "maintaining, preserving and adding value to digital research data throughout its lifecycle" (n.d.). Additionally, the Data Observation Network for Earth (DataONE) Data Lifecycle depicts eight different stages that data may pass through: Planning, Collection, Assurance, Description, Preservation, Discovery, Integration and Analysis. This lifecycle illustrates the variety of stages that can be associated with data management. It should be noted that not all research pathways will require data to go through every step of this lifecycle. Nevertheless, each stage of the data lifecycle requires specific activities to be performed in order to achieve the desired overall data management results, and there is a multitude of education and training resources that can support researchers in meeting these requirements.
With the goal of helping members of the community easily access data management training and education resources, DataONE has developed a series of 'Best Practices' for data management that are grouped around the eight stages of the data lifecycle. These also exist as a series of education and training resources for both online learning and face-to-face instruction. When delivering these resources, the DataONE Community Engagement and Outreach (CEO) Working Group actively collects user feedback to ensure the materials are meeting their objectives and remaining relevant to the needs of the community. The DataONE team strives not only to create and deliver high quality education and training resources, but also to measure the effectiveness of those resources. Therefore, recognizing the importance of conducting systematic evaluations and the value the results could have in contributing to the design and enhancement of the education and training resources for managing data, DataONE developed a standardized tool for systematic evaluation of education and training resources across all formats.

Background and Rationale
Supporting the continuing education and training of professionals through a variety of formats, including webinars and seminars, is a practice that has been developed and well adapted by many professional disciplines. However, while continuing education is "the process of engaging in educational pursuits with the goal of becoming up-to-date in the knowledge and skills of one's profession" (Weingand, 1999), there are challenges in achieving this goal. Specifically, among the various areas for consideration, such as the political, social and economic, finding "better ways to integrate continuing education, both its content and its educational design, into the ongoing individual and collective practice of professionals" (Cervero, 2000) has been discussed as a critical issue when building and implementing continuing education programs. Additionally, it has been noted that a wide range of factors could affect the quality and value of a continuing education program (Hoyt and Whyte, 2011). Therefore, in order to optimize the programs' outcomes and maximize their potential for success, it is crucial for organizations that provide education and training aimed at working professionals to evaluate their programs carefully. The evaluations and their corresponding results can assist not only in gauging a program's value by determining whether the identified needs and objectives have been met (Shapiro, 2009), but also in suggesting measures to ensure that the programs' learning objectives align effectively with those of the individuals who participate in the educational or training programs.
DataONE focuses on supporting data management through its online and in-person education and training resources, which are created and maintained by members of the DataONE CEO Working Group. The online resources include: 1) training modules and accompanying exercises, available in PowerPoint and PDF format respectively, that can be accessed and viewed online as well as downloaded and incorporated into other teaching materials as needed, and 2) webinars that can be accessed freely and streamed directly via the Internet. DataONE also sponsors in-person events, for which the activities can be organized as presentations, workshops, or seminars. DataONE customizes the training topics and duration (from one hour to several weeks) based on the attendees' learning objectives and the size of the group in order to help address their specific data management concerns.
Because DataONE is "committed to engaging a broad and diverse community of users, and engaging students and citizens in science through efforts that span the entire data life cycle, from data gathering, to management, to analysis and publication" (n.d.), it is vital that these education and training resources remain useful and relevant to all the users in the community with respect to their data management requirements. In addition to participating frequently in collaborative training activities with other organizations in order to expand the user base, one primary method for quantifying the success of DataONE's contribution to its community is to determine the value and effectiveness of the education and training resources as recognized and acknowledged by users. Hence, as members of DataONE's CEO Working Group, the authors have developed an accessible tool to evaluate the effectiveness of the data management education and training resources.

Methodology
In framing the focus of the evaluation tool, the authors apply the definition for evaluation as "a systematic determination of merit, worth, and significance of something or someone using criteria against a set of benchmark standards" (Ministry of Interior and Japan International Cooperation Agency, n.d.). Additionally, in order to leverage existing best practices and lessons learned when developing evaluation criteria, we conducted a broad literature review to inform the design and features for this tool. Specifically, we focused on the following topics:
• Evaluating the effectiveness of education and training resources that are delivered:
  ◦ In person
  ◦ Via the Internet (including with videos/audios and either synchronous or asynchronous)
  ◦ Text-based (such as handouts and worksheets, and including presentations in PowerPoint format)
• Best practices/guidelines regarding evaluation design and development
• Available survey tools and platforms
The results of these literature reviews helped in informing the structure and the details of the evaluation tool, and the specific information learned from each literature review is discussed as follows.

Evaluating the Effectiveness of Education and Training Resources with Different Delivery Modes
A literature review focused on the first topic group found that Kirkpatrick's 'Four Levels of Learning Evaluation' framework, which was developed specifically to address how to evaluate training programs (Kirkpatrick, 1979), provided a four-category outline that could be used as the key guideline for structuring an evaluation tool.
• Level 1: Reaction (Reaction of the Participants): a measure of customer satisfaction;
• Level 2: Learning: the extent to which participants change attitudes, increase knowledge, and/or increase skill as a result of attending a program;
• Level 3: Behavior (Change in Job Performance): the extent to which a change in behavior occurs because someone attended a training program;
• Level 4: Results (Organizational Performance): a measure of final results that occurred because a person attended a training session (Employment Security Department, Washington State, 2010; Accounting-Management, 2013).
Also, for purposes of the evaluation granularity, the intent was to be as detailed as possible. As a result, the evaluation tool could also address four 'depths', i.e. learning step (conceptual level), unit (course level), curriculum (program level), and project (institutional level) (Barker, 1999). Further, as the tool should be adaptable for evaluating the various modalities of the education and training resources, including those that the DataONE CEO Working Group had produced and published, different evaluation areas within the tool could be emphasized according to the resource delivery format. In other words, while content structure/logic, topic knowledge, and language used could be evaluated for all resource types, the tool could offer optional evaluation categories for specific resource formats as follows:
• In-person: teaching style and facilitation, e.g. how well the resources and their supporting components (such as topic discussions and group activities) were organized in order to suit live instruction and learning;
• Online: technology and visual aids, e.g. how well the technologies were employed, such as in the user interface and web design, in assisting the delivery of the resources;
• Text-based documents (including both presentations in PowerPoint format and any forms of written documents, such as handouts and leaflets): precision and timing, e.g. whether the resource was able to explain the topics clearly and succinctly.
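As an illustration only, the arrangement described above (a common set of evaluation areas plus optional areas that depend on the delivery format) can be sketched as a small lookup. The identifiers `COMMON_AREAS`, `FORMAT_SPECIFIC_AREAS`, and `evaluation_areas` are hypothetical names introduced for this sketch and are not part of the evaluation tool itself; only the area labels are taken from the text.

```python
# Evaluation areas named in the text; the identifiers are illustrative assumptions.
COMMON_AREAS = ["content structure/logic", "topic knowledge", "language used"]

# Optional areas that apply only to a specific delivery format.
FORMAT_SPECIFIC_AREAS = {
    "in-person": ["teaching style and facilitation"],
    "online": ["technology and visual aids"],
    "text-based": ["precision and timing"],
}


def evaluation_areas(delivery_format: str) -> list[str]:
    """Return the common areas plus the optional areas for one delivery format."""
    key = delivery_format.strip().lower()
    if key not in FORMAT_SPECIFIC_AREAS:
        raise ValueError(f"unknown delivery format: {delivery_format!r}")
    return COMMON_AREAS + FORMAT_SPECIFIC_AREAS[key]


if __name__ == "__main__":
    for fmt in FORMAT_SPECIFIC_AREAS:
        print(fmt, "->", evaluation_areas(fmt))
```

Keeping the common areas separate from the format-specific ones means a survey for a new delivery format would only need one additional entry in the lookup.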

Best Practices/Guidelines Regarding Evaluation Design and Development
After establishing an understanding of the evaluation criteria that should be part of the evaluation tool, the second literature review was performed to determine the best choice for soliciting and receiving evaluation feedback. The literature review gave us the opportunity to consider methods for both in-person and virtual evaluations. The results showed that even though there were several options available for administering an evaluation, including interviews, questionnaires/surveys/polls, and observations (Johnson, 2008), not all evaluation methods could be readily delivered via the Internet. This was an important aspect to consider because most users access and utilize DataONE data management resources in the online environment. This meant that the evaluation tool needed to facilitate quick and convenient solicitation of user feedback. In addition, so that other organizations could reuse the same tool to evaluate the value and effectiveness of their own education and training resources, the tool would need to be easily customizable. Therefore, we decided that a digital survey would be the most applicable evaluation technique, as it would be the quickest method to reach a majority of DataONE community members and would allow easier access and reuse by others. In the case of in-person education and training events, the digital survey could be converted to a printed format, so that the evaluation could be completed either online or as a face-to-face discussion between the event organizers and attendees.

Available Survey Tools and Platforms
Once we decided to use a digital survey as the evaluation tool format, we also conducted a literature review to compile and study the best practices and guidelines regarding the construction and implementation of digital surveys. Because of this decision, the review focused on techniques that were applicable for the online environment. From the review, we learned that the following areas should be considered when building a survey: goals/objectives, targeted audience, evaluation questions, structure/flow, response collection methods, related policies, and logistics, with special attention given to the syntax of the evaluation questions and the questions' possible responses. Additionally, there are three major categories of survey tools available as options for implementation: web-based platform providers, apps/widgets, and software packages. In the case of DataONE, because there was existing experience and infrastructure with web-based tools, the evaluation survey would also be implemented using a web-based platform.

Result
DataONE's Education EVAluation tool, or EEVA, is the current revision of DataONE's evaluation survey instrument for education and training resources, and it is publicly available and freely accessible from DataONE's website. Based on the knowledge collected and studied through the literature reviews, we focused on creating EEVA with the following characteristics so that it could be used to evaluate the effectiveness of data management education and training resources:
• Takes the form of a digital survey;
• Includes the structure and content recommended by the best practices/guidelines of survey design;
• Is customizable to evaluate various education and training modalities and can be implemented using a web-based survey platform.
Overall, EEVA consists of two key components described as follows.
The first component is a recommended outline describing the sections that should be included in the evaluation survey. There are a total of six major sections plus an additional ten sub-sections, categorized based on the specific evaluation areas, under the 'Evaluation Question' section. The complete listing and descriptions of the sections are included in Appendix A.
Among the six major sections of the recommended survey outline, the content for most of the sections, such as survey title, introduction, instructions, and thank you note, would be created based on the actual evaluation scenarios. As a result, EEVA's users could follow the recommended survey structure, but would have to provide their own content for these sections.
The content for the 'Evaluation Questions' section can be readily retrieved from DataONE's website and is the second key component of EEVA. The suggested evaluation questions were compiled primarily through literature reviews and based on existing sample questions. However, the final wording of the questions was adjusted so that the questions are applicable to the evaluation of data management education and training resources. The 'Evaluation Questions' section is organized into ten sub-sections as shown in Appendix A, and there are a total of 89 suggested evaluation questions. Appendix B shows the number of questions available for each evaluation category. In addition to providing possible wording and corresponding responses for each question, there are three different filters that can be used to assist users in deciding on and finalizing the evaluation questions to include in a survey. Furthermore, the resulting questions can be exported in three different formats: .doc (Document), .xls (Spreadsheet), or .qsf (Qualtrics). The different export formats were selected to help facilitate integration of the evaluation questions with other survey tools. Ultimately, EEVA is designed so that its users can quickly and efficiently review, select, and modify the desired questions for their respective evaluations as appropriate. Appendix C shows a partial view of EEVA's home page at DataONE.
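The select-then-export workflow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not EEVA's implementation: the three filters are not specified in this paper, so filtering by category and by question type is assumed here; the `Question` fields, `select_questions`, and `to_csv` are hypothetical names; the bank entries are placeholders rather than actual EEVA questions; and CSV is used as a stand-in for the .doc, .xls, and .qsf exports.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    category: str       # evaluation category (one of the sub-sections)
    question_type: str  # e.g. "likert", "open ended", "dichotomous", "multiple choice"
    text: str


# Placeholder entries only; these are NOT questions taken from EEVA.
QUESTION_BANK = [
    Question("Category A", "likert", "Placeholder question 1"),
    Question("Category A", "open ended", "Placeholder question 2"),
    Question("Category B", "likert", "Placeholder question 3"),
]


def select_questions(bank, category=None, question_type=None):
    """Keep the questions that match every filter that was supplied."""
    return [
        q for q in bank
        if (category is None or q.category == category)
        and (question_type is None or q.question_type == question_type)
    ]


def to_csv(questions) -> str:
    """Serialize the selected questions as CSV text (a stand-in export format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "question_type", "text"])
    for q in questions:
        writer.writerow([q.category, q.question_type, q.text])
    return buffer.getvalue()


if __name__ == "__main__":
    chosen = select_questions(QUESTION_BANK, category="Category A")
    print(to_csv(chosen))
```

Treating an omitted filter as "match everything" lets a user narrow the bank one criterion at a time before exporting.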

Discussion
In creating EEVA, we recognized that research design is a formal process that requires in-depth study. The literature reviews performed were intended to collect common recommendations and were not meant to cover the full research design process in detail. As a result, users of DataONE's EEVA are welcome to review the tool and to make modifications to it or provide feedback for improvement as desired. For instance, EEVA currently includes the Likert scale as one of its four possible question types (the other three are open ended, dichotomous, and multiple choice), and the possible responses for the Likert scales are presented in a five-point format. The decision to employ an odd scale as opposed to an even scale, and to include five points in total, was based on another literature review performed during this project (an odd scale has an exact midpoint that often represents a neutral position between the positive and negative response options, whereas an even scale has equal numbers of positive and negative response options and no midpoint). We believed that the five-point format would strike a balance between giving respondents the opportunity to consider and express different levels of attitude and keeping the scale from becoming too complex and time-consuming. However, depending on the evaluation scenario, different evaluation questions and response types could be required. Consequently, the evaluation questions and the corresponding question types can be adjusted at EEVA users' discretion.
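To make the five-point format concrete, the following sketch tallies responses to a single Likert question. It is an assumption-laden example rather than part of EEVA: the response labels in `SCALE` are assumed wording (EEVA's actual labels may differ), `summarize` is a hypothetical helper, and reporting a mean treats the ordinal scale as numeric, which is a common but debatable simplification.

```python
from collections import Counter
from statistics import mean

# Assumed labels for a five-point scale; EEVA's actual response wording may differ.
SCALE = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,  # the exact midpoint that an odd-numbered scale provides
    "Agree": 4,
    "Strongly agree": 5,
}


def summarize(responses):
    """Return counts per scale point, the mean score, and the neutral share."""
    scores = [SCALE[r] for r in responses]
    if not scores:
        raise ValueError("no responses to summarize")
    counts = Counter(scores)
    return {
        "counts": {point: counts.get(point, 0) for point in range(1, 6)},
        "mean": mean(scores),
        "neutral_share": counts.get(3, 0) / len(scores),
    }


if __name__ == "__main__":
    sample = ["Agree", "Agree", "Neutral", "Strongly agree"]
    print(summarize(sample))
```

Reporting the share of midpoint responses alongside the mean shows how often respondents chose the neutral option that an even-numbered scale would not offer.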
Additionally, while EEVA is made available for open, free access and reuse, users and their respective organizations will be responsible for the actual data collected. As a result, before using and adapting the suggested evaluation questions from EEVA, each user should obtain and understand the privacy, security, and confidentiality policies that might apply to a specific evaluation scenario. Any applicable policies should also be made clear and available to potential survey respondents, so that respondents understand and agree to the ways in which their information and responses will be stored, processed, and protected.
Finally, when considering the use of EEVA, it is critical that users also determine the type of analysis and reporting format that might be applied to responses. Having consistent analysis and producing straightforward, easy-to-read statistics or reports for evaluations could enable long term benefits, such as tracking changes in behavior and attitude over time and comparing the difference in effectiveness for various styles of education and training resources. Even though DataONE currently does not provide recommendations regarding the types of analysis or report format that should accompany the evaluation, users should consider these areas and implement their own solutions as they see fit.

Future Work
As we begin to test EEVA's implementation and continue to refine features from the current revision, we have targeted two complementary areas for further investigation.
The first area is additional education and training methods or techniques that the DataONE CEO Working Group could employ to augment the education and training options it currently offers. For example, based on the initial literature review that was conducted for this project, supplementary education and training activities and events, including multimedia-enabled quizzes and exercises, interdisciplinary/multicultural content, and knowledge/skill set checklists, could be integrated with DataONE's existing resources. The evaluations performed using EEVA would provide us with insights into users' education and training preferences, so that we could determine which new methods or techniques should be added. As these new education and training methods or techniques become available, additional evaluations could be completed to refine the methods and techniques. Essentially, the added content could enhance the interactivity of the education and training resources and increase user engagement, therefore potentially improving the resource's overall effectiveness.
The second area is additional channels for user feedback. While EEVA is currently in the form of a digital survey, other user feedback channels may also be considered by DataONE to broaden communication with users. For instance, social listening, online community/discussion boards/feedback portals/forums, and rankings/reviews all represent the kinds of communication in which DataONE community users might participate. By participating in or facilitating discussions within these communication media, DataONE might be able to observe and receive more natural responses from users, complementing the feedback provided by survey results.
Ultimately, feedback from DataONE users and the wider community is key to the success of the education and training resources, as well as to the management of eScience data. As a result, DataONE will continue to solicit and review evaluation results and to optimize the effectiveness of data management education and training in order to support eScience for the long term.

Conclusion
As data with the characteristics of 'Big Data' continue to increase in the current eScience landscape, it is becoming increasingly crucial for everyone who works in and supports scientific research to understand and obtain data management knowledge and skills. Through managing data, not only could individuals minimize the effect of the 'data deluge,' but the results of data management could also enable additional benefits, including allowing data to be preserved and accessible for long-term use and reuse.
By providing a variety of complementary data management education and training resources, DataONE seeks to help its community members in learning and applying data management techniques as well as in resolving the issues that the members might encounter while working and interacting with data. The effectiveness of the resources would, therefore, be vital in improving the community members' abilities to fulfil their data management needs, and equally important, in supporting the success of DataONE's overall community outreach and engagement effort. Through the development of DataONE's Education EVAluation (EEVA) tool for the education and training resources, DataONE intends to uphold the effectiveness of its data management education and training resources through administering evaluation surveys and examining the corresponding results systematically. At the same time, by making the tool publicly available to be freely reused by other education and training organizations, DataONE will aim to contribute to the growth of data management, and ultimately, to the advancement of eScience as a whole.