Towards a Data Sharing Culture: Recommendations for Leadership from Academic Health Centers

Rebecca Crowley and colleagues propose that academic health centers can and should lead the transition towards a culture of biomedical data sharing.


Benefits of Data Sharing for Academic Health Centers
The benefits of data sharing and reuse have been widely reported. We summarize them here, from the perspective of an AHC.
The predominant benefit of data sharing is accelerated scientific progress. Advances are clearly valuable to an AHC when translated into improved patient outcomes, reduced research costs, and decreased time in moving discoveries from the bench to the bedside.
Of more immediate benefit to AHCs and their researchers, sharing data increases the visibility and relevance of research output. Sharing data generates opportunities for additional publications through collaboration, and may increase the citation rate of primary publications [7]. Since publication history and citation impact are often considered in future funding decisions, these benefits are likely to accelerate research programs, and thus enhance the reputation of the academic institutions.
Data sharing can also benefit an AHC in its roles of educator and employer. Health care professionals trained in clinical informatics [8] benefit from exposure to real-world data. By embracing data sharing goals, an AHC becomes more appealing to cutting-edge researchers [9], and thereby more able to recruit the talent required for future successes.
Finally, the widespread adoption of a data sharing culture needs leaders [10], and thus provides an opportunity for AHCs to demonstrate excellence.

A Leadership Role
Despite the anticipated benefits, sharing research data has yet to be widely adopted in biomedicine [11,12]. Through their interwoven roles in education, research, and policy, AHCs can lead the development of best practices for establishing a data sharing culture. Practical steps with potentially powerful impact are discussed below and summarized in Box 1.
Measure, recognize, and reward data sharing contributions. The lack of recognition incentives is regarded as a crucial and unresolved obstacle to establishing a data sharing culture [13,14]. All research institutions, including AHCs, should develop and track metrics for data sharing contributions as part of their academic research environments. Data sharing contributions should be explicitly considered during hiring, tenure, and promotion decisions [15], perhaps by providing a bonus to a publication's impact factor if the authors have shared the raw research data. Promotion committees should encourage investigators to list their shared datasets on their CVs, in their grant applications, and anywhere they communicate information about their research accomplishments.
Department chairs should encourage their faculty to monitor the purposes for which their data are reused. This would allow investigators to quantify the value of their contribution, as well as personally motivate future sharing [16]. To this end, we encourage the development and general adoption of a data sharing citation index, a concrete metric for tracking the reuse and citation of datasets, as envisioned by the Cancer Biomedical Informatics Grid (caBIG) Data Sharing and Intellectual Capital Workspace and others [17,18].
Integrate data sharing education into curricula and practice. Data sharing must be articulated as a foundational principle of research conduct. Standardized and comprehensive education is likely to be an important factor in decreasing data withholding [11]; data sharing should be included in the curricula of introductory research courses and throughout mentored research. Discussing the ethics of data sharing in clinical and translational research during medical training and graduate research studies can cement a deeper "appreciation that sharing of raw data may lead to techniques or findings or further research that could help alleviate human distress" [19]. Simultaneously, education must appropriately place data sharing within the context of the federal regulations that guard protected health information [20,21] and the ethical obligation to maintain patient privacy by highlighting the distinction between openly sharable scientific data and protected health information.
Addressing these subjects at institution-wide colloquia, as case studies in ethics seminars, or as satellite symposia [22] will provide scientists an opportunity to hear viewpoints they might not otherwise consider. Topics could include the ethical obligation to patients to both maintain privacy and achieve the maximum authorized scientific benefit [19,23,24], the personal struggles felt by investigators when trusting peers to be responsible in data reuse [25], and the impact of reorienting discussions from data ownership to data control [26].
AHCs also play a vital role in educating researchers about the consumer side of the data sharing relationship: responsible data reuse. AHC policies, best-practice guidelines, and guided mentorship can help new trainees take advantage of the enormous opportunities when reusing data while avoiding misappropriation and misinterpretation. Furthermore, understanding the needs and benefits of data reuse will inspire investigators to share their own data with the documentation and annotations that make it most useful for future reuse.
Recommend best-practice mechanisms for data sharing. As biomedical funders begin to require data sharing plans, they often leave the mechanism for data sharing unspecified. Although this choice provides valuable flexibility, the myriad of options can be daunting for investigators. The choice is important: an appropriate mechanism is crucial for effective and rewarding data sharing.
An AHC's office of research can help its investigators choose best-practice solutions by recommending a framework for evaluating data sharing alternatives. To develop such a framework, IRB (institutional review board) directors, chief privacy and security officers, chief information and technology officers, technology transfer officers, and a wide range of patient advocates and investigators must articulate the trade-offs inherent in various models from the perspectives of privacy, security, intellectual property, scalability, openness, and equity across the complete spectrum of stakeholders [23]. We illustrate three dimensions of these trade-offs in Table 1, and recommend several excellent reviews for further reading [27,28,29].
Fund and maintain infrastructure for data sharing. Education, training, and support are needed again once a scientist has decided to share data. Investigators may appreciate detailed suggestions on what to include in a data sharing plan, such as those provided by the National Institutes of Health (NIH) [30] and caBIG [31]. Mentorship and training through the institution's research office are also crucial when estimating a data sharing budget, since "currently, these costs are chronically underestimated and under-awarded" [32]. This funding is crucial to pay for the process of sharing data.
It is often difficult for investigators to decide where to share types of data that do not have a public, centralized, and well-recognized database. We recommend that research leadership in AHCs support solutions that optimize data persistence, visibility, ease of interpretation and integration, privacy, accountability, and openness. Such solutions could involve participating in data sharing collaborative projects, choosing information technology solutions that facilitate data sharing and provide required access logs, hosting data sources that do not have a more appropriate home, adopting syntactic and semantic standards [33], providing consultation to investigators who need help sharing their research effectively, encouraging participation in professional societies such as the HealthGrid (http://www.healthgrid.org/), or lobbying for national networked infrastructure [34].
Revise policies and guidelines to reflect data sharing goals. We encourage AHCs to recognize the importance of data sharing across the organization, and then take steps to harmonize all relevant policies and guidelines with their data sharing goals. Many of the issues are clear, such as ensuring that data sharing goals are consistent with material transfer agreements, industrial partnerships, intellectual property policies, technology-transfer guidelines, IRB review criteria, and de-identification tools and policies. Other issues are often overlooked. For example, AHCs need to ensure that data sharing agreements contain appropriate remedies and are enforced whenever investigators are unwilling or unable to fulfill their commitments [35].

Box 1
2. Recognize data sharing contributions in hiring and promotion decisions, perhaps as a bonus to a publication's impact factor. Use concrete metrics when available.
3. Educate trainees and current investigators on responsible data sharing and reuse practices through class work, mentorship, and professional development. Promote a framework for deciding upon appropriate data sharing mechanisms.
4. Encourage data sharing practices as part of publication policies. Lobby for explicit and enforceable policies in journal and conference instructions, to both authors and peer reviewers.
5. Encourage data sharing plans as part of funding policies. Lobby for appropriate data sharing requirements by funders, and recommend that they assess a proposal's data sharing plan as part of its scientific contribution.
6. Fund the costs of data sharing, support for repositories, adoption of sharing infrastructure and metrics, and research into best practices through federal grants and AHC funds.
7. Publish experiences in data sharing to facilitate the exchange of best practices.

Today's spirit of translational research does not stop at the boundaries of the AHC. Departments of physics and computer science have a successful history of data sharing and may be able to provide guidance. Other departments within science, engineering, business, librarianship, and law are addressing the same issues; it may be possible to forge alliances that advance data sharing. Involving key officials at the University level, such as Vice Presidents of Research and university legal counsel, could yield more consistent policies across campus.
Engage national leadership in data sharing decisions. AHCs are actively involved with many members of the biomedical community. Firmly establishing a data sharing culture will require joint efforts between AHCs, funders, publishers, academic societies, industry, legislators, patient advocates, clinicians, and researchers. We recommend that AHC faculty and staff leverage their roles in the community to promote philosophies and policies that facilitate data sharing. This could involve promoting new funding mechanisms to support data sharing and data archiving [32], working with journal editors to raise the level of data sharing deemed appropriate and necessary for publication [5], supporting legislation to encourage privacy-protected data sharing [36], developing standards for appropriate reuse of health care data [26,37], establishing grant review guidelines for evaluating data sharing plans as part of the scientific contribution of a proposal, expanding NIH guidance and support for data sharing across all data types [38], encouraging the study of incentives for team science [39], developing methods to quantify the extent and impact of data sharing and reuse, and finally, encouraging programs and funding that enable investigators to share data with accuracy, accountability, responsibility, and recognition [40]. We further recommend that AHCs publish their experiences in data sharing to facilitate the development of best practices.

Conclusion
We recognize that there are real and perceived impediments to sharing biomedical research data. Some individual donors may have personal interests in privacy and confidentiality that exceed their desire to contribute to new methods of detecting and treating disease. Investigators may restrict access to data to maximize their professional and economic benefit. Academic health centers may view data sharing as a threat to intellectual property, possibly impeding entrepreneurial spin-offs and technology transfers that bring revenue and act as incubators for future research. AHCs may also worry that the data could be used to critique their health care practices rather than advance the research frontier. Industrial sponsorship can hinder plans for sharing data, and the regulatory environment may necessitate stringent oversight to ensure compliance and minimize risk. These issues can and must be addressed as we work to embrace a data sharing culture. The hurdles may not be as high as we think: 99% of senior technology transfer officers at highly funded NIH universities agree that academic scientists should freely share data with other academic scientists after publication [41]. The systems and architectures in Table 1 provide a future vision of research in which data are more universally available and interoperable. Recent initiatives for making research publications freely available [42,43,44,45] demonstrate a political and academic commitment "to help advance science and improve human health" [46] by widely sharing research results.
Academic health centers will benefit by leading the transition towards a culture of biomedical data sharing. More widespread awareness of these benefits can motivate key stakeholders to take concrete steps to enable, inspire, and reward data sharing within and beyond their institutions.