Research Data Management Policy and Practice in China

On April 2, 2018, the State Council of China formally released a national research data management (RDM) policy, “Measures for Managing Scientific Data”. Literature review shows that university libraries have played an important role in supporting research data management at an institutional level in countries in North America, Europe and Australasia. The aim of this paper is to capture the current status of RDM in Chinese universities, in particular how university libraries have been involved in taking the agenda forward. This paper uses mixed methods: a website analysis of university policies and services; a questionnaire for university librarians; and semi-structured interviews. Findings from the website analysis and questionnaires indicate that research data services (RDS) at a local level in Chinese universities are in their infancy. On the whole there is more evidence of activity in developing data repositories than support services. Despite the existence of a national policy there remain significant barriers to further service development, such as the lag in the creation of local policy, insufficient funding for technical infrastructure, shortages of staff skills in data curation, and language barriers to international data sharing and open science. RDS in Chinese university libraries are still lagging behind the English-speaking countries and Europe.

Submitted 16 December 2019 ~ Accepted 19 February 2020

Correspondence should be addressed to Yingshen Huang, Information School, The University of Sheffield, Level 2, Regent Court, 211 Portobello, Sheffield, S1 4DP, U.K. Email: huangys@pku.edu.cn

This paper was presented at the International Digital Curation Conference IDCC20, Dublin, 17-19 February 2020.

The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by the University of Edinburgh on behalf of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/

Copyright rests with the authors. This work is released under a Creative Commons Attribution Licence, version 4.0. For details please see https://creativecommons.org/licenses/by/4.0/

International Journal of Digital Curation 2020, Vol. 15, Iss. 1, 18 pp. http://dx.doi.org/10.2218/ijdc.v15i1.718 DOI: 10.2218/ijdc.v15i1.718


Introduction
On April 2, 2018, the State Council of China formally released a national research data management (RDM) policy, "Measures for Managing Scientific Data" (The State Council of China, 2018). Measures was the first attempt to define the responsibilities of administrative institutions such as the Ministry of Science and Technology and provincial technology departments, as well as of individual research institutions and research data centres. The policy makes it clear that local institutions should establish their own policy and create research data services (RDS) to improve RDM.
There has, of course, been earlier work around RDM in China. Since at least 1984, there has been spontaneous, informal academic exchange and sharing of ideas about research data in the country (CN-CODATA, n.d.). In 2001, the "Meteorological Data Sharing Management Regulation" was issued; this was the first data resources management policy in China and focused on data sharing (China Meteorological Administration, 2008). Nevertheless, considering China's importance to global scientific production, Measures comes relatively late compared to developments in RDM policy in North America, Europe and Australasia.
In the context of this major government initiative, this paper seeks to examine the changing status of RDM in China by exploring two specific questions: 1. What is the current status of policy, practice and services in Chinese universities? 2. How is Measures impacting local policy, practice and services?
Given the centrality of academic libraries in developing research data services internationally, this paper examines these questions particularly from the library perspective. The analysis draws on websites, a survey and interviews. Data from previous work on the international development of RDS by libraries, conducted in 2014 and repeated in 2018, is used to provide comparative context (Cox, Kennan, Lyon, and Pinfield, 2017; Cox, Kennan, Lyon, Pinfield, and Sbaffi, 2019).

The Emerging Policy Context in China
The last 15 years have seen the gradual emergence of recognition of the importance of RDM in government and funder policy, triggered by the OECD Principles and guidelines for access to research data from public funding (2007). In response to this emerging policy framework, institutional services have advanced at the local level (Tenopir, Pollock, Allard, and Hughes, 2016; Tenopir et al., 2017; Cox, Kennan, Lyon, et al., 2017, 2019). Measures reflects this trend, but there has been previous work around RDM in China. Prior to 2018, there were policies specific to certain natural sciences that focus on measurement, with an emphasis on data sharing, submission and long-term preservation (CMA, 2008; MOST, 2004). The national research institution, the Chinese Academy of Sciences, has built a scientific data cloud offering a distributed mass storage environment (Li, Yu, Zhang, Liu, and Wu, 2015). There has also been activity at university level, with some institutions creating data platforms for sharing and reuse (Liu and Rao, 2013; Zhang, Yin, Zhang, Guo, and Zhang, 2015; Luo, Zhu, Cui, and Nie, 2016). In 2014 the China Academic Library Research Data Management Implementation Group, committed to promoting the development of RDM, was jointly established by the libraries of some high-ranking universities (Yin and Wang, 2014). However, as in other countries, there have been significant barriers to developing RDS, such as lack of policy norms, inadequate technical support and skills gaps (Zhou, Duan, and Song, 2017). Research data services nationally appear to be in their infancy.
Analysis of Measures reveals some differences between Chinese policy and national policy elsewhere. Most EU policy, for example, is advisory, but the Chinese policy, as an executive and governmental order issued by the highest research management department of China, is compulsory and mandatory (SPARC Europe, 2017; UKRI, 2016). Nevertheless, it will not be straightforward to translate it into practice. Measures, as a national guideline, sets out the responsibilities of institutions at various levels, but does not say how to address these responsibilities. Furthermore, in setting out differing responsibilities, Measures only names high-level stakeholders such as national and provincial bodies, research institutes that generate and manage data, and data centers that focus on data curation. Measures does not mention or define the role of other stakeholders, such as researchers, funding organizations, publishers and data professionals (Erway, 2013). The definition of research data used in Measures is somewhat ambiguous, but later parts of the text, which mention the range of applications of the policy, do imply that it is relevant to all disciplines.

Methodology
This paper is based on three forms of data: a website analysis of university policies and services; questionnaire results; and semi-structured interviews. The research approach was reviewed and approved by the University Research Ethics Committee (UREC) of The University of Sheffield.
The website analysis sought to identify the main aspects of current RDS, with data being collected from January 1, 2019 to April 1, 2019. Specifically, the analysis examined: 1) Policy: rules, regulations, or plans. 2) Which departments are involved in RDS. 3) Services: advisory services, technical services. The scope of the analysis was the 137 Double-class universities which are approved by the Chinese Ministry of Education as key universities (MOE, 2017), 11 universities in Hong Kong which are eligible to award doctoral degrees, and 4 universities in Macau.
In order to deepen the understanding of the status of RDM, a follow-up questionnaire was sent to the library directors of Chinese universities. To improve comparability of results, a Chinese version of the questionnaire used by Cox, Kennan, Lyon et al. (2019) was developed and piloted, and then distributed to the target universities' libraries via an invitation email sent directly to library directors from June to November 2019. Because Chinese academic library staff contact details are not always published, some invitations were sent to the library's public mailbox. The invitation email ultimately reached 122 libraries (107 in mainland China, 12 in Hong Kong, three in Macao), and 63 valid responses were received. The data from the questionnaire was analysed through descriptive statistics and some factor and comparative analysis.
The literature review and early questionnaire results suggested that most Chinese universities have not yet developed RDS. Between October and December 2019, ten semi-structured interviews were conducted, via remote video or voice call, with librarians who were interested in RDM or Open Science, in order to understand the drivers and challenges for RDS and capture the changing scene. Some of the target interviewees were selected from the questionnaire respondents, representing those with a greater interest in RDM, and the others were selected from institutions that did not respond to survey invitations. The purpose of this data collection was to enable the creation of case studies of pathfinder institutions that are leading the way in developing RDS in China. This paper focuses on reporting the survey results.

Policy
At the time the website analysis was conducted, only one university, Hong Kong University, had a policy for Research Data and Records Management, which was released in 2015. It was an adapted version of Oxford University's policy of 2012 (The University of Hong Kong, 2015). Some universities had announced the national Measures or had a provincial notice about the release of Measures on their website. However, none of the other universities in the sample appeared to have a policy in place. This may be because there is the possibility of the creation of a national data service, though this is very much in doubt (Yuan, 2018).

RDM practice and service
Although there was no formal policy in place in any of the 137 Double-class universities in China, nine universities did have their own data platforms containing data or reports from research projects. According to the types and nature of data collected, the data platforms can be divided into social science data platforms and comprehensive ones. Seven of them are social science platforms, which store and make open data including statistical data, social survey data and social project outcome data (Fudan, Renmin, Huazhong University of Science and Technology, East China Normal, Sun Yat-Sen, Hunan and Tsing Hua University). Platforms run by Peking and Wuhan universities contained data from a more comprehensive range of subjects. The Peking University Open Research Data Platform, set up in 2016, and the Hunan University Economic Data Research Center, set up in 2013, have a user guide and usage rules similar to a data policy, but these are not strictly data management policies. A common feature of these platforms is that the data they store is almost always social science data, the rate of deposit of material remains low, and users rarely submit data (Liu and Zeng, 2017).
The website analysis also revealed that the libraries of Peking, Fudan and Wuhan University provide research data services, including advisory services as well as a data platform. Training, courses, presentations and workshops about RDM are being organized by five academic libraries, enabling other stakeholders, such as researchers, the research management office and IT departments, to learn more about RDM, clearly distinguishing between managing data and making it fully open. Though the academic library plays an important role in supporting research, RDM was more normally being led by Research Management Offices.

Response rate and the respondents
As of the end of November 2019, the questionnaire had received 63 valid responses: 42 respondents completed the whole questionnaire and 21 answered it partly. The response rate was 52% (n=122), which is not as high as expected, but may in itself reflect the low development of RDS in China.
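The response-rate arithmetic can be checked with a short calculation. The following is a minimal sketch in Python; the function name `response_rate` is an illustrative choice, and only the counts (42 complete and 21 partial responses from 122 libraries reached) come from the survey.

```python
def response_rate(valid_responses: int, invitations_reached: int) -> float:
    """Return the response rate as a percentage of invitations reached."""
    if invitations_reached <= 0:
        raise ValueError("invitations_reached must be positive")
    return 100.0 * valid_responses / invitations_reached


# Counts reported in the text.
complete, partial, reached = 42, 21, 122
valid = complete + partial

rate = response_rate(valid, reached)
print(f"{valid} valid responses, response rate {rate:.1f}%")
```

The unrounded rate is slightly below the reported figure, which is the value rounded to the nearest whole percent.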
More than half of the valid respondents were from universities located in Beijing and Hong Kong. As intended by the method of circulation of the survey, 75% of participants were members of the senior management team of the library, and 50% of responses were from library directors, who are likely to be responsible for the overall future planning of their library at a strategic level and might be thought to understand the priorities for university development.

RDM policy
We asked questions about the RDM policy in the university and which departments were involved in developing the policy. 8% of respondents stated that their institution has an RDM policy (Figure 1), but no formal policies, rules or guidelines could be found on the universities' websites, except for Hong Kong University. Only a further 23% of institutions planned to have a policy. A rather larger number had no plans. The result is consistent with the website analysis but with a higher rate of respondents saying that they have or plan to have a policy, perhaps due to the lack of transparency of university business on the internet.
[Figure 1. Response options: We have a policy now; We will have a policy within the next twelve months; We are planning a policy but it may be more than a year; We do not have a policy and are not planning one; Don't know.]

Auditing institutional data and researchers' attitudes
A few universities had also undertaken an audit of institutional research data (26%, n=47). In the universities that undertook such an audit, most of the libraries did participate but did not take the leading role. This suggests that libraries are taking a less dynamic role than seems to have happened in other countries. That only 13% (n=46) of participants have undertaken a survey of faculty/academic staff attitudes to RDM suggests there is a lack of awareness of RDM at the university level and that most are waiting for specific mandatory requirements or detailed policy at the national or provincial level. Open text comments relating to this question suggest that the results of surveys of researchers' attitudes might have less influence on policy making.

Research data services (RDS)
About 42% of the respondents said they provide some research data related services and a further 26% plan to provide them (see Figure 4). The services refer to any kind of service that relates to research data, such as advisory support, technical support, an institutional repository or a data platform.
Figure 4. Research data related services (n=50).
We will provide related services within a year: 3 (6%)
We are planning to provide services but it may be more than a year: 10 (20%)
No, we do not provide related services and are not planning to: 12 (24%)
Don't know: 4 (8%)
The library is a service-providing institution within the university, so even where there was no policy from the university or funding organization, some libraries intended to widen their service range and create new services or new roles in response to the emerging RDM agenda, trying to establish good practice in supporting research. Libraries therefore participate heavily in developing RDS, with 90% planning to do so and more than half planning to participate in a leading role (Figure 5).

Development of advisory and technical services
The questionnaire investigated RDS development through a matrix of choices on a wide range of services offered by libraries, with no service = 0, basic service = 1 and well developed or extensive service = 2. Figure 5 shows the current development for each service type. Compared with the previous survey conducted by Cox, Kennan, Lyon et al. (2017, 2019), technical services are more developed than advisory services in China, especially "Run a data repository", where almost all respondents considered that they had reached the basic service level. The strategic priority given to RDS was evaluated via the same matrix catalogue. Figures were calculated on the basis of scoring low priority = 0, a mid-level priority = 1 and top priority = 2.
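The scoring scheme can be illustrated with a small calculation. The sketch below is in Python and assumes that the per-service figure is a mean of the scored answers, which the text does not state. The function name, the second service label and all answers are invented for illustration and are not survey data.

```python
from statistics import mean

# Scoring scheme described in the text.
SERVICE_SCORES = {"no service": 0, "basic service": 1, "well developed": 2}


def mean_scores(responses: dict[str, list[str]]) -> dict[str, float]:
    """Map each service type to the mean of its scored responses."""
    return {
        service: mean(SERVICE_SCORES[answer] for answer in answers)
        for service, answers in responses.items()
    }


# Invented answers from four imaginary libraries, for illustration only.
example = {
    "Run a data repository": [
        "basic service", "basic service", "well developed", "basic service",
    ],
    "Advise on data management plans": [  # hypothetical service label
        "no service", "basic service", "no service", "no service",
    ],
}

for service, score in mean_scores(example).items():
    print(f"{service}: {score:.2f}")
```

The same function would apply to the strategic priority matrix by swapping in a mapping of low priority = 0, mid-level priority = 1 and top priority = 2.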

External cooperation
While providing RDS, about 65% of libraries (n=29; 17 respondents replied Yes and two replied No but planned) already cooperate or plan to cooperate with external organizations and use commercial products to deliver RDS. As there was an open text box for respondents to answer in more detail, we know from the comments that libraries intend to link the RDS to an existing or planned institutional repository, which holds institutional research outputs, mainly publications.

Librarians' responsibilities and skills
Measures is the first formal policy issued by the nation, and every research institution should undertake some activities in response sooner or later. But researchers manage their data all the time whether there is a formal RDM policy or not, so although there is no policy, there is still some research support from libraries. Figure 8 shows how the library has organized RDM support: two thirds of libraries would assign RDS tasks to a specific research data team or an existing research support team, with 37% assigning them to the existing team. Chinese university libraries have always provided some research support, including literature retrieval, innovation checks of research programmes and discipline competitiveness evaluation reports, meaning that subject librarians have a close relationship with departments and researchers. Meanwhile, IT librarians have some relevant skills, such as maintaining the institutional repository, that research data services also need. When faced with newly emerging needs, it is more economical and easier to develop a new kind of service based on the existing staff or team, so reducing the training, learning and time cost.

Knowledge or skills needing development to offer RDS
Thirty-five respondents answered this question, which sought to find out what skills the library thinks are most needed for delivering RDS, whether the library already has the services or plans to (Figure 9). All the respondents thought knowledge of a variety of research methods (e.g. data analysis, data visualisation) is necessary for delivering RDS. The second highest necessary skill was data curation. A high concentration on these two options is similar to the previous 2018 survey, but the mean percentage of needed skills in the Chinese responses is very high, up to 81%, which indicates a lack of knowledge and skills. Needs for subject or discipline knowledge differed a lot between the two surveys; the reason might be that subject librarians in China are thought to have higher professional skills than a general librarian with a subject background and to have a deeper understanding of the research in their subjects/majors.

Drivers and challenges
The survey asked respondents to reflect on the drivers and challenges for libraries working in RDM in two open questions. Twenty respondents wrote about 650 words of comments in response to the question about the major drivers, and 34 themes were identified in this text (Table 1). Though the volume of comments that the open questions collected may not be statistically significant, some underlying factors are still apparent from this summary.
No driver was mentioned in all the comments, but nearly half of the respondents to this question emphasized awareness of RDM among the university's leadership, the library's director and researchers. Top-down requirements or policy are thought to be more efficient than bottom-up ones.
'The increased awareness by the university of the importance and value of research data, the researchers' attitudes to the openness of research data.'
'Whether the library undertakes RDS or not and how to undertake them chiefly depends on the library director's awareness, depends on the director's thought that these are the things library should do for school or not.'
'The University's attitude and policies towards RDM.'
The university's library is an institution that provides support and meets the needs of academic staff and students, so it is difficult for the library to tell researchers what to do with their data and how to do it without a policy or requirements from funders or the university. Researchers tend to keep to their own way of managing their data and have no extra time to share or prepare to share data if there is no direct impact on their funding.
The library's new role or responsibility was mentioned many times among the drivers for working in RDM. As the number of users physically coming into the library goes down gradually every year and users can conveniently access resources online, many people, even librarians themselves, might doubt the library's value.
'Crisis in the awareness of the library.'
'The library should commit the responsibility of information organization and knowledge management.'
'Improvement of library's research service.'
'Driven by the expansion of the range of library's services.'
'To broaden and deepen of library service capabilities.'
Some key universities' libraries have begun to focus on supporting research and study, so research data services are a new chance for them to strengthen their presence and the value of their service. Although there is no institutional policy in place, libraries actively try to get involved in RDM to prove their relevance.
Funders' requirements were also mentioned, but not as much as expected nor as much as in the previous survey of other countries.
When asked about challenges, respondents provided even more comments. Twenty-two respondents answered the open text question, writing about 730 words, and 50 items were coded because the text often mentioned multiple points (Table 2). Skills or knowledge (21%) are mentioned as a barrier in about half of the answers (10 mentions out of 22 answers). Research data management implies a complicated workflow, involving several stakeholders and requiring professional skills and knowledge, especially considering the scale of data and the differences among disciplines. Services that focus on different needs rest on various skills, and it is not easy to recruit specialists in time and keep relevant skills and knowledge up to date. It is not hard to predict the difficulties in supporting researchers from the wide range of disciplines that is characteristic of a comprehensive university.
'Skills, skills, skills. We just can't find people with the right ones.'
Challenges such as Acceptance of sharing data (13%), Acceptance in the institution (9%), Lack institutional policy (9%) and Lack mandate/rewards (9%) are closely related to the awareness of RDM among the university and researchers, which was thought to be the major driver in the previous open question; these factors are inseparable and affect each other. High awareness in the university of its own research data heritage can promote the development of policy, which can place requirements on researchers.
'The difficulty is that the university does not pay attention to nor value the management of research data, and researchers are not willing to open their own research data.'
'Now: Awareness of RDM; Future: Culture and incentive for researchers to spend effort in RDM.'
'No one will follow if no mandates.'
'… (3) University's support in terms of budget, human resources and a policy that makes RDM a requirement; (4) Faculty's willingness to share data.'
'The sharing of research data, sensitive data.'

Discussion
According to the completed part of the study presented here, we find that Chinese RDS are in their initial stages of development. Compared with the maturity model of RDS proposed by Cox, Kennan, Lyon et al. (2017; 2019), Chinese RDM seems to be following a different path in which technical services are established earlier than basic advisory services and in advance of the development of policy and more advanced advisory services. From 2009 on, some universities began to set up data platforms or data repositories to preserve the research data collected from certain social science projects, even though Measures had not yet been issued. Those data platforms or repositories have the basic functions of searching, accessing and reusing data. The reason for this path of development may be that there are similarities between an institutional research outputs repository and a data repository, so it might be more economical to build the two kinds of repository together instead of separately. However, this unique model also has its own challenges: lack of policy support and of awareness of the RDM concept, the data life cycle, the FAIR principles and so on among various stakeholders would reduce the reliability and sustainability of the data platform or repository.
The findings from the website analysis and the questionnaire about whether there is a policy in place were the same, suggesting that there is almost no institutional-level policy more than one and a half years after the national policy Measures was issued. Only Hong Kong University has an RDM policy, which came out in 2015 and has not been revised or updated since the national policy was published. Four other respondents said 'we have a policy now', but we were not able to confirm this by viewing the university websites after collecting the questionnaire. The reasons for this deviation between the website analysis and the questionnaire might be: 1) respondents have their own understanding of policy, so that rules or instructions for a data platform, or an introduction to RDM or RDS, might be thought to be institutional policy; 2) the lag in universities' publication of open information and the relatively weak promotion of policy and services.
In the open questions about major drivers and challenges for libraries working in RDM, the awareness of RDM among the university, librarians and researchers was frequently mentioned as a challenge, which indicates that research data management might not be the greatest priority in the university or that there is a lack of drivers at the university level. Research activities in universities differ from those in research institutions: they are part of the major work of a university, which still has an important responsibility for education, so the university has to balance the distribution of resources, in terms of financial and staffing support, across various needs. Besides, most research projects within universities have been funded directly by the national funding organization; the university is only helping the national funder to organize those research activities.
In contrast to western countries, it is rare for commercial, civil society or personal funding to support research activities in universities, and most projects in universities are funded by the national or provincial government. It takes a lot of time and a huge amount of work to localize and develop provincial or institutional policy, rules or requirements from such a general guiding national policy, which does not give further details of how to manage research data. Libraries have been doing preparation work to the extent of their capabilities, such as providing pilot services, building up data platforms and training, while waiting for policies or requirements that do not depend on them. As for universities, they might focus on how to help academic staff get more funds and raise the university's global reputation, believe it is the responsibility of researchers or research teams to look after their data, and so not invest more in the library or elsewhere to promote RDM work or take RDM as a high-priority task. So more universities are inactive towards RDM and waiting for requirements or policy from the higher administrative level.

Conclusion
RDS at a local level in Chinese universities are in their infancy. The current study is timed to capture trends in a rapidly developing context. Despite the existence of a national policy there remain significant barriers to RDS development, such as the lag in the creation of local policy, insufficient funding for technical infrastructure, shortages of staff skills in data curation, and the language barriers to international data sharing and open science. RDS in Chinese university libraries are still lagging behind the English-speaking countries and Europe.