Pinning It Down: Towards a Practical Definition of ‘Research Data’ for Creative Arts Institutions

There is a widespread understanding among scientific researchers about what is meant by ‘research data’; however, this does not readily translate into a creative context. As part of its engagement with the University of the Arts London (UAL) and via its support for the JISC Managing Research Data Programme, the Digital Curation Centre (DCC) and partners have worked towards an acceptable and practical definition of research data for creative arts institutions. This paper describes the activities carried out to help pin down such a definition, including a literature review, short and extended interviews with researchers, interactions with an academic arts research practitioner, and distillation of the results from a one-day workshop which took place in London in September 2012.

International Journal of Digital Curation (2013), 8(2), 99–110. http://dx.doi.org/10.2218/ijdc.v8i2.275

The International Journal of Digital Curation is an international journal committed to scholarly excellence and dedicated to the advancement of digital curation across a wide range of sectors. The IJDC is published by UKOLN at the University of Bath and is a publication of the Digital Curation Centre. ISSN: 1746-8256. URL: http://www.ijdc.net/


Introduction
To employ efficient research data management (RDM), practitioners and institutions must first be aware of the characteristics of the research data they are creating, as well as the policies, rules and norms that pertain. The first step in this task is achieving consensus over what research data comprises, and the forms it takes. In the sciences this first step is relatively straightforward and reasonably well described; however, this is not always the case in the creative disciplines. RDM is becoming more important across all disciplines, due in no small part to increasingly stringent funder and government requirements and expectations, coupled with a need for institutional ownership, oversight and demonstrable research integrity. It is therefore important that those disciplines which struggle to define research data (and its relationship with other forms of research output) attempt to do so, and to manage it in ways sympathetic to its object and environment.

Context
Disciplinary variations notwithstanding, there is a widespread understanding among scientific researchers about what is meant by 'research data', and research funders and institutions have supported and enabled this via written definitions:
'Research data – that which is collected, observed, or created in a digital form, for purposes of analysing to produce original research results.' (University of Edinburgh, 2011).
'Data, or units of information, which are created in the course of funded or unfunded research, and often arranged or formatted in such a way as to make them suitable for communication, interpretation, and processing, perhaps by a computer.' (University of Bristol, 2013).
'The data, records, files or other evidence, irrespective of their content or form (e.g. in print, digital, physical or other forms), that comprise research observations, findings or outcomes, including primary materials and analysed data.' (Beitz, Dharmawardena and Searle, 2012).
The terminology employed in such definitions does not readily translate into a creative arts context. A key issue is the notion of data as fact: the creative arts place less currency on fact and more on representation and expression. Indeed, a number of influential definitions situate research data as an exclusively science-specific matter: 'All data collected in some way or another in the context of scientific/scholarly research. A distinction can be made between primary data (empirical, observed, measured data) and secondary data. Secondary data is data derived from sources created previously (figures published by the authorities, data assembled previously, archived data, texts, etc).' (Tjalsma and Rombouts, 2011).
In this climate, those working to improve the management and sharing of research data in institutions specialising in the creative arts have an additional task compared with those working towards the same ends in other disciplines: the translation of scientific research data management concepts into language meaningful to those working in creative arts disciplines. Garrett and Gramstadt (2012) attempt to understand the nature of research data in the visual arts. They found that visual arts research data can be both "tangible and intangible, digital and physical, heterogeneous and infinite, and complex and complicated." Despite efforts in this area, descriptions of research data for specialist arts institutions so far remain equivocal and awkward.
The primary funder of UK arts research, the Arts and Humanities Research Council (AHRC), incorporates data-related requirements in its policies, but stops short of offering a solid definition of what this might encompass, preferring to list a number of possible digital output types: 'The outputs of the research may include, for example, monographs, editions or articles; electronic data, including sound or images; performances, films or broadcasts; or exhibitions. Teaching materials may also be an appropriate outcome from a research project provided that it fulfils the definition above.' (AHRC, n.d.).

Process
The Digital Curation Centre (DCC) is a centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community. Via the Higher Education Funding Council for England's Universities Modernisation Fund, the DCC is working in close contact with a group of eighteen UK higher education institutions, ranging from ancient research-intensive universities to newer specialist institutions, in order to examine and enhance their data management capacities, and share lessons with the entire sector. As part of its engagement with UAL, and also via its support for the JISC Managing Research Data Programme (of which KAPTUR is a funded project), DCC colleagues are working towards an acceptable and practical definition of research data for creative arts institutions. This definition is first and foremost a definition for UAL, which we anticipate will be usable by other arts-focused institutions with minor editing.

At this point it may be pertinent to define the distinction between visual arts and arts institutions. Many arts institutions have teaching and research in non-art areas, notably subjects like architecture, landscape architecture, product design engineering, sociology, anthropology, and ethnography. Designers, likewise, will often not think of what they do as being 'art', although it is usually both visual and creative.
The definition that the DCC is working towards, and which is described in this paper, is a subset of the policy development work being carried out at UAL, and of the supporting procedures and guidance. It aligns with UAL's contribution to the KAPTUR project in that there is a shared link between the project and the engagement, and our intention was to supplement rather than duplicate KAPTUR activities. However, our focus was broader than that of KAPTUR, which is explicitly concerned with the visual arts. This work went beyond the visual arts to seek a definition that everyone (including designers, architects, etc.) can engage with. Such a definition could make it easier to apply academic principles for good scientific research to artistic research and practice as well. Work in this area must recognise the concerns among some artists that shoehorning artistic production into the principles of scientific research will threaten processes of inspiration, creativity and idiosyncrasy.

Interviews
As part of the institutional engagement work, two sets of interviews were carried out at UAL: a set of 12 short researcher surveys carried out by telephone in June 2012, and a set of 17 extended face-to-face interviews carried out in July 2012, with responses recorded in a Bristol Online Survey. The shorter interviews informed the structure for the longer, more in-depth interviews.
It was clear from the initial telephone interviews that the term 'research data' is not widely used across several creative disciplines at UAL. For example, one researcher commented that: "the term doesn't mean a lot." When asked about the research data they produce in their research process, the majority of participants offered answers including publications or exhibitions, and not data per se.
The term was often only understood when specific examples were offered of how the researcher might use research data in their own field, for example in an archive, or with sketchbooks or test results. For researchers who do not think that they use research data, the term proved to be a problem, and the interviewer, John Murtagh from UAL, was only able to engage researchers fully with the concept where funders had explicitly requested management of the data underpinning research outputs.
These findings replicated those of 12 other interviews carried out across four other institutions as part of the KAPTUR project (Gramstadt, 2011). In those interviews the expressions 'documenting the research process' and 'visualisation and documentation' were offered as alternatives to 'research data'.
The interviews confirmed issues around discipline-specific terminology. One approach suggested was that terminology be centred firmly within established research practices, such as research in archives. This might change attitudes towards the use of research data.
The longer interview questions were written with the help of academic researcher Dr Paul Ryan. Opening information read in the interviews aimed to help researchers understand what was implied by data. The wording is attached as Appendix A. This approach set a 'level playing field', so issues around the definition of research data in the arts were significantly reduced. Some of the initial questions asked were framed around the idea of 'organisational moments' or 'trigger points for data creation or management activity'. For example:
• Is data produced during the clarification of a research question?
• Is data produced during the design of a methodology? If so, what kind of data does this tend to produce?
• Is data produced during the clarification of your position of interpretation?
The concept of trigger points to help researchers understand the processes they undertake and the outputs of those processes is not a new one.A useful example from more traditional disciplines is the Archaeology Data Service (ADS) model of 'preservation intervention points' (Austin, Bateman, Jeffrey and Mitcham, 2011).
Later questions looked at the actual approaches taken. For example, when asked where and how they create their research data, interviewees' answers on location included in situ, in the studio and in a gallery; answers on method included creating artwork (canvas, clay, paper, in performance), using a camera, sketchbook, notebook or computer programme (Photoshop, Frontpage, wiki), conducting interviews and conversations, visualising key findings using software, and using SPSS. When asked what percentage of their research data is digital or physical, the results indicated that a fairly high percentage was primarily digital (most giving over 70% as an answer). This result may reflect the selection of interviewees rather than indicate a wider trend. When asked what type of physical data they created, the most prominent answers were paper, photographs, models and artworks.
One interesting response came when an interviewee was asked whether they had received remarks about their data: "Although the word data would never be used, one way to gauge this, is the amount of times I am asked to speak publicly about the work and the creative process behind it. I am also approached almost constantly by other younger researchers who want to know more about the creative process."
Both sets of interviews established that a significant amount of digital research data is being created by UAL's researchers, yet they often fail to see it as data per se. However, most recognised the importance of the data they create, both to themselves and potentially to others.
As a result of these privileged interactions with practising academic artists, it was agreed that a user-friendly definition of 'research data' would be written specifically for UAL.

Practitioner Discussions
Throughout the UAL institutional engagement process the practitioner's perspective has been provided by lecturer and postdoctoral arts researcher, Dr Paul Ryan. Ryan developed TAG, a semiotic research tool, for his doctoral thesis, titled 'Peirce's Semeiotic and the Implications for AEsthetics in the Visual Arts: a study of the sketchbook and its positions in the hierarchies of making, collecting and exhibiting' (Ryan, 2009).
In this work, Ryan defines what he calls 'data' and 'iconic data'. These are treated as the two ends of a spectrum, with some data sitting part way, or even half way, between them.
''Data', as we usually mean it, would be 'factual' evidence that is gathered to support a claim or hypothesis; which in 'conventional' research tends to be represented 'symbolically' (in words, sentences, numbers, or conventionally agreed signs). The fuller name for this kind of data, would be 'symbolic data'.'
''Iconic data' would also be 'factual' evidence gathered to support a claim or hypothesis; but it would tend to be represented 'iconically', that is: through resemblance of some kind, often in a single instance or group(s) of similar instances. Typically these will be pictures, noises, movements, performances etc. (perhaps re-presented in images, films, diagrams etc (digitally or analogue)). These stand as instances of meaning and are therefore not generally agreed upon in the same way as the symbolic data above. However, they can be relied on as evidence because when 'experts' (perhaps artists) in the appropriate field consider them, they 'tend' to agree on what they mean; and to what extent they can be relied on as evidence (in research). This last point relates to the pragmatic level of 'truth'; this has particular requirements with regard to what can be considered to be 'factual' (how we would act on such evidence if we believed it to be true)' (Ryan, 2009).
Ryan elucidates that it used to be the case that incorporating the 'position of inquiry' within research was, at worst, unnecessary. Now, even in the hardest of the sciences, it is accepted that the researcher influences data, and data is therefore more valid if the position of interpretation from which it is presented is clarified. This allows the researcher, and their colleagues, to find value in data even if it has been skewed by individual tastes/influences/histories because the position of interpretation can be taken into account. Clarifying the position of inquiry can only succeed in reducing ambiguity or uncertainty.

Workshop
KAPTUR, a JISC MRD project following on from the KeepIt and Kultivate projects, aims to discover, create and pilot a sectoral model of best practice in the management of research data in the visual arts. It is led by the Visual Arts Data Service (VADS) and undertaken in collaboration with four institutional partners: Glasgow School of Art; Goldsmiths, University of London; University for the Creative Arts; and University of the Arts London.
On Friday, 14 September 2012 the KAPTUR project ran a one-day workshop, entitled 'Managing the Material: Tackling Visual Arts as Research Data', inspired by the DCC and Australian National Data Service (ANDS) how-to guide 'Appraise and Select Research Data for Curation' (Whyte and Wilson, 2010). The workshop, which was aimed chiefly at researchers, encouraged attendees to share their thoughts about what constitutes research data in the creative arts. During the day break-out groups were framed around three main themes: 1. Where's the data? Where's the use?

Visual arts materials in practice.
The first session ('Where's the data? Where's the use?'), facilitated by DCC staff, looked at identifying the research data that might arise out of the research process. Some current definitions of research data were presented and delegates were asked to consider how these might work within visual arts. In a hands-on session, delegates were encouraged to write down examples of research data within visual arts on post-it notes, and then to consider if they fitted the definitions given previously.
The post-it note answers included the following:
• Supporting work: storyboards, mood boards, sketchbook pages, notes, architectural models, reflection journals;
• Recordings of activities/conversations (video/audio);
• Raw data: digital photographs, video recordings, interviews;
• Interdisciplinary needs: computer algorithms, interactive physical art, installation, interactive experience of the art work (for neuro-psychology);
• Exhibition records, catalogues, preview invitations, correspondence with venue/curators.
Discussion during the activity raised some interesting observations and issues. It became apparent that some types of research data proved more difficult to align with a definition than others. For example, while video footage could be described as 'recorded factual material', once it had been edited and manipulated it no longer met these criteria. One concept often used within the arts is the idea of 'provocation', i.e. lying (for dramatic effect or for art's sake). Research data in the arts is often not factually correct and occasionally explicitly factually incorrect, at direct odds with established scientific principles.
Additionally, a significant amount of research at UAL is interdisciplinary, which may lead to methodological frictions. For example, work carried out in the DAC research centre (Design Against Crime), a socially responsive practice-led research initiative, requires expertise from engineering, psychology, criminology and design. Interdisciplinary practices demand that the institutional research data management policy and workable definitions cut across the board. So while some argued that a definition of research data might be restrictive, others felt that it could be helpful when carrying out RDM. It was felt that arts subjects can learn a lot from more science-based approaches to RDM.
The differences between the scientific and creative methods, relationships between the analogue and the digital, sensitivities towards standard terminology, resistances to definition and issues arising from the unclear separation between professional and personal practice are all aspects of arts research/praxis which set it apart from the standard scientific methodology.
Andersson (2009) makes some interesting points about the differences and similarities between conventional scientific research and art research. Andersson comments that "the two fields have a fairly shallow knowledge of each other" and that "scientists are criticized for being too analytical, elitist and objectivistic, and artists are ascribed subjectivity, irrationality and sublime rapture as primary driving forces in their work (thereby discrediting them as researchers, alternatively launching them as champions of "another" knowledge)." Andersson attempts to "break out of the deadlock dynamic around these theoretically and philosophically contagious issues." He suggests that the initial phase (from engagement or interest to actually starting a defined process of production of meaning), which we might call conceptualisation, is similar for arts and scientific disciplines because it is "messy, intuitive, emotional and insecure". He also talks about 'data collection' and how processes again are similar for both types of researchers. However, arts researchers work amethodologically (using empirical methods in a transparent way) and they use references differently (i.e. it is their prerogative to lie and misrepresent). Andersson concludes his article by suggesting that research is about moving towards "new knowledge and meaning", a definition that works well for both types of researchers. The article supports the embracing of good research practice by arts researchers, although it alludes to the reality that performing artistic research in a demystified way may be threatening to many.
Note that Andersson also mentions the idea of provocation: "If we look at modern art and relational aesthetics, we see that artists have used lying, cheating and copying as techniques in their representational work to make their art pieces effective." Additionally, while the sciences present themselves as orderly and systematic, the reality is that there is ample interpretation within the physical sciences. Arguably, the goal of this interpretation is to reduce ambiguity or uncertainty, which would also seem contrary to methods in the arts or humanities.
Allen Renear and Lauren Teffeau have also written about the process of extending data curation to the humanities, mainly as part of the DigCCurr work looking at extending the digital curation curriculum (Renear and Teffeau, 2009). Although they do not focus on the creative arts, their general principles are nonetheless useful. They suggest that best practice development for curation of cultural data has a reciprocal relationship with the curation of scientific data, each informing the other and together advancing data curation as a discipline.
In the closing session of the KAPTUR workshop, discussions considered whether a definition can consist solely of examples. The Oxford Dictionary states that a definition should be "a statement of the exact meaning of a word, especially in a dictionary; an exact statement or description of the nature, scope, or meaning of something; the degree of distinctness in outline of an object, image, or sound." The implication is that there will be description involved. However, many definitions do give examples and, given that the aim is to 'make clear and distinct', such an approach may be necessary.

Conclusion
In late 2012 the UAL research data management policy was approved by the Research Standards and Development Committee (RSDC). The policy applies to all UAL staff involved in externally funded research, especially where the funding body requires a data management plan. Its primary application is to existing, live awards and future funded research, and within the definition of scope the following wording is included: 'Research data in the Arts is not so easily defined as in STEM subjects. The data types cited in this policy are not intended to be exhaustive, and definitions of what constitutes research data will vary from funder to funder. Generally, research data can be considered anything created, captured or collected as an output of funded research work in its original state.
As an example, the Arts and Humanities Research Council (AHRC) says "The outputs of the research may include, for example, [...] electronic data, including sound or images; performances, films or broadcasts." In essence, this policy covers raw materials and finished outputs, but not necessarily the stages in between. It applies primarily to externally funded, digital research data, although non-digital data (such as sketchbooks) may also be covered, and requests from researchers to digitise existing analogue research data will be considered on a case-by-case basis. Where data exists in a non-digital form, appropriate effort to manage this to meet the expectations is also likely to be required. No reasonable external request to access analogue research data resulting from externally funded research will be refused, and access should be arranged between the principal investigator and the department of Research Management and Administration (RMA).'
The definition remains loose and will likely be refined over time as challenging 'edge cases' emerge. However, as a practical definition it serves a purpose, and the key for UAL now is supporting the implementation of the policy. The definition is one tool in a set required to do this.
The operation of the policy will be supported by a research data management 'toolkit' (including guidance, a research data management plan checklist, an audit methodology and tool), and a programme of training for researchers.These are in development as part of the UAL institutional engagement and will be in place by May 2013.
The research process towards a practical definition has given rise to many useful discussions. It is clear that creative arts institutions need a definition of research data to help them move towards better research data management practices, yet the 'art is different' mindset is often a difficult one to overcome. Andersson (2009) notes that: 'Since the art world gets its alluring qualities to a large degree from the mysteriousness of the auteur and his/her fantastic life and surroundings, performing artistic research in this demystified way may be threatening both to artists' self-conception as well as to the art market.' Those involved in research data management working at arts institutions or within creative disciplines will have an important role to play in supporting researchers and demonstrating that research data management is a positive, enabling activity rather than a threatening and unwelcome one.
