Generating ‘good enough’ evidence for co-production

Co-production is not a new concept but it is one with renewed prominence and reach in contemporary policy discourse. It refers to joint working between people or groups who have traditionally been separated into categories of user and producer. The article focuses on the co-production of public services, offering theory-based and knowledge-based routes to evidencing co-production. It cites a range of ‘good enough’ methodologies which community organisations and small-scale service providers experimenting with co-production can use to assess the potential contribution, including appreciative inquiry, peer-to-peer learning and data sharing. These approaches have the potential to foster innovation and scale-out experimentation.


Introduction
Co-production is not a new concept but it is one with renewed prominence and reach in contemporary policy discourse. It refers to joint working between people or groups who have traditionally been separated into categories of user and producer. Co-production is most commonly deployed in the context of public service delivery, although it has also been used to refer to producer/consumer collaborations in relation to research, policy and commercial services (Verschuere et al, 2012; Beebeejaun et al, 2013; 2014; Durose et al, 2013; Durose and Richardson, 2015).
The focus here is on the co-production of public services, which aims to harness the insights which occur through closer working between people using and delivering services. This type of co-production is often associated with the work of Nobel Prize winner for Economics, Elinor Ostrom, who used the term to describe a process by which 'inputs from individuals who are not "in" the same organisation are transformed into goods and services' (1996, 1073). It is defined by Sharp as 'the recognition that public services are the joint product of the activities of both citizens and government officials' (1980, 110).
This article begins by discussing the relative weakness of the evidence base surrounding co-production, and offering two explanations for this: first, the breadth of the term and its lack of programmatic focus; and second, the shifting parameters the political spectrum has fuelled a sense of conceptual ambiguity surrounding co-production, perhaps reinforcing its appeal (Verschuere et al, 2012).
What is notable for debates on evidence-based policy making is that co-production has been granted an influential role in the future of public services and indeed public governance on the basis of little formal evidence. It is used to signify and denote both a range of policy objectives and the means of achieving them (Durose and Richardson, 2015). A number of reviews that the authors have been involved with in the UK (a policy review for the Arts and Humanities Research Council (AHRC) Connected Communities programme (Durose et al, 2013) and research guides for the Social Care Institute for Excellence (SCIE) (Needham and Carr, 2009; SCIE, 2013)) have highlighted the limits of the evidence base for co-production. These meta-reviews examined existing evaluations of co-production case studies and found few evaluations that would be placed at the top of traditional evidence hierarchies (for example, controlled studies or systematic reviews). Single case studies were widely cited despite a lack of independent evaluation. Approaches such as Social Return on Investment (NEF, 2010) were published without always providing publicly-accessible evaluation methodologies. The reviews also found that many of the existing case studies were published by organisations with a pre-existing commitment to working co-productively, giving a normative and reifying feel to the findings. There is also a lack of longitudinal evaluation: cases offer a snapshot of success rather than an account of sustained organisational change.
Similar patterns have been found in international comparative work on co-production. Brandsen et al acknowledge the reliance on case studies in co-production evaluations: '[t]he debate would benefit from greater methodological diversity (specifically, more quantitative comparative work) and yet further conceptual clarification' (2012, 387). Comparative evidence (either in terms of comparing across sites of co-production, types of services or outcomes or comparing co-production with more 'traditional' approaches to local public service provision) is limited (Verschuere et al, 2012; Durose et al, 2013). The economic case for co-production in particular is hard to sustain on the current evidence base. The AHRC policy review concluded: 'The case for co-production is often made in terms of its "strong potential relationship to efficiency" (Ostrom, 1993, 231) but there are limits to existing evidence' (Durose et al, 2013, 11).
A second factor to consider when exploring the apparent mismatch between the reach and evidence base of co-production is that the political context is shifting in terms of what is meant by evidence, and what counts as good or appropriate evidence. Recent UK governments have made an explicit commitment to evidence-based policy making (EBPM) as part of a stronger 'delivery' agenda (Sullivan, 2011). There is much scepticism about the extent to which policies and politicians are any more evidence-based than they were in the past (Wells, 2007; Meager, 2010). However, the discourse of EBPM has created a shift within the evaluation communities of government about what kinds of evidence are admissible, with a more formal privileging of positivist empiricism than was evident in the past (Rhodes, 2011; Sullivan, 2011). It can be argued that the demand for more rigorous 'scientific' forms of evaluation, and the establishment of 'what works' in terms of the success of narrow policy interventions, has led to a corresponding scepticism towards qualitative research focusing on limited areas or assessment of case studies designed to explore 'how it works' (Sullivan, 2011; HM Government, 2013).
Co-production has a relational dimension which does not easily fit this evaluation context. In undertaking the co-production reviews for AHRC and SCIE, which involved engagement with policy actors outside academia in the fieldwork and dissemination phases, practitioners argued that co-production is most likely to grow through spreading ideas and innovation to local peers and developing locally appropriate practice in ways that reflect citizen preferences for 'small-scale, informal activities' (Richardson, 2011, 5 cited in Durose et al, 2013, 32), enabling local innovation to flourish (Bunt and Harris, 2010; Berry, 2012; O'Donovan and Rubbra, 2012; Porter, 2012; Bovaird, 2013). However, practitioners also felt under pressure from government commissioners (locally and nationally) to evidence the benefits of co-production using formal evaluation tools which did not fit this informal, local context.
Accepting this polarity, between a rigour borrowed from the natural sciences and informal local knowledge, suggests that co-production is destined to continue as an under-evidenced approach to public service reform. However, it is possible to identify approaches which provide a basis for evidencing the contribution of co-productive approaches without embracing positivist empiricism.

'Good enough' methodologies
A first step to building the evidence base for co-production is to utilise theory-based approaches to evaluation which make clear what it is that co-production is supposed to offer and sidestep its definitional ambiguities. The second step is to explicitly include the insights of people working within public services as a form of knowledge-based practice drawn from proximity and familiarity, rather than leaving this as an implicit part of evaluation which can be dismissed as excessively normative. Together these approaches create scope to gather 'good enough' evidence which community organisations and small-scale service providers experimenting with co-production can use to assess its contribution. Three methods are suggested here: appreciative inquiry, peer-to-peer learning and data sharing.

Articulate a theory of change
Theories of change-type evaluations have been extensively used by evaluators in the last decade (Pawson and Tilley, 1997; Fulbright-Anderson et al, 1998). Whilst these evaluations have their limitations (Sullivan, 2011; Powell, 2011), they do provide case study-based evaluations with theoretical accounts of how the intervention is expected to work, against which evaluation findings can be compared. As Glasby comments, 'essentially this [approach] means articulating a clear hypothesis about how and why a policy is meant to work [which] can then be used as a basis for evaluating the success or otherwise of the subsequent policy' (2011, 93; see also Pawson, 2006).
Much of the theorisation around co-production has been of the who/what/when/how type (Bovaird, 2007; Needham and Carr, 2009; Verschuere et al, 2012; Osborne and Strokosch, 2013), and it is less common to find accounts of why it is that co-production is expected to produce its espoused benefits. The work of one of the earliest theorists of co-production, Elinor Ostrom, remains a helpful guide to the theory of change which underpins co-productive approaches. As she observed, 'Co-production is not, of course, universally advantageous. Nor is it a process that will occur spontaneously simply because substantial benefits could be achieved' (1996, 1082). Ostrom outlined a series of conditions which 'heighten the probability that co-production is an improvement over regular government production or citizen production alone' (1996, 1082). These conditions, validated in other studies (Brandsen and Helderman, 2012; Verschuere et al, 2012), offer a possible benchmark against which co-production approaches can be evaluated.
Ostrom's first condition is stated as follows: '… when co-productive inputs are diverse entities and complements, synergy can occur. Each has something the other needs…' (1996, 1082, 1079). So, co-production produces efficiencies by bringing together diverse forms of expertise, resource and assets in new and creative ways. Ostrom's second condition is that there must be flexibility for participants: 'options must be available to both parties' (1996, 1082). This condition warns of the homogeneity of centralisation and against 'organisational fixes' where a particular design is valorised as having intrinsic advantages (Durose et al, 2013, 17). The third condition that Ostrom sets out is that 'participants need to be able to build a credible commitment to one another so that if one side increases input, the other will continue at the same or higher levels' (1996, 1082). This condition asserts that benefits are accrued through co-production due to transparent, accountable relationships between the participants. Ostrom's fourth condition is that incentives are used to 'help to encourage inputs from both officials and citizens' (1996, 1082). Such incentives 'may be little more than the opportunity for officials to get to know citizens and vice-versa in an open and regular forum' (1996, 1082). Together these four conditions explicate a theory of change for co-production, 'opening up the "black box" between programme inputs and outputs' (Sullivan, 2011, 503). They allow tailoring to the specifics of a particular case study and can generate transferable insights into the potential efficiencies and wider benefits of co-production.

Incorporate knowledge-based practice
As well as rooting the evaluations in a stronger theoretical foundation, it is also useful to make explicit the focus on what Glasby and Beresford (2006) termed knowledge-based practice. These authors reject evidence hierarchies and argue that 'the "best" method for researching any given topic is simply that which will answer the research question most effectively' (Glasby, 2011, 89). This approach enables them to argue that 'the lived experience of service users or carers and the practice wisdom of practitioners can be just as valid a way of understanding the world as formal research (and possibly more valid for some questions)' (Glasby, 2011, 89). As Glasby puts it, '… some research questions mean that proximity to the object being studied can be more appropriate than notions of "distance"' (2011, 89). He makes the case for greater use of experiential evidence, in contrast to empirical evidence, 'for example, how the process is viewed and experienced by service users and staff whose behaviour shapes and contributes to empirical outcomes…' (2011).
This approach offers a way to draw on the insights of the people working in co-productive ways, rather than assuming that they are too 'close' to the case study to be able to offer valid insights. Recognising the credibility of people working in co-productive ways is also a way of more explicitly acknowledging the value base which underpins co-production, in which traditional notions of professional expertise are complemented by the expertise of lived experience. From this perspective, co-production could be beneficial even if outcomes and spending remain stable. For example, co-productive ways of working could enhance the skills and sense of efficacy of participants, and foster the emergence of new social movements. Organisations seeking to evaluate their own examples of co-production could do so in ways which are explicit about the underpinning value base and the benefits gained by proximity as well as distance for understanding what works.

Gather 'good enough' evidence
Theory-based evaluation that makes explicit how co-production is likely to operate, alongside a more overt articulation of knowledge-based practice, is only part of what is needed to build support for co-production. To strengthen the evidence base for co-production it is also necessary to identify pragmatic and cost-effective ways to gather evidence. Many co-productive activities will be small-scale and will be undertaken by organisations operating outside of the core of central and local government. These are not initiatives that are ever likely to command a formal government evaluation. Such groups need to better understand the sorts of evidence which are most likely to carry weight with the decision makers operating in their sector (local authority commissioners, councillors, civil servants, ministers, and so on), recognising that 'some groups seem to reject forms of evidence or information that others see as potentially valid' (Glasby, 2011, 94). As Sullivan puts it, 'There are multiple sources of evidence and hence multiple "truths", to engage with policy makers as "policy entrepreneurs"…' (Sullivan, 2011, 509). Three approaches which might be useful for case study evaluators are: appreciative inquiry, peer-to-peer learning and data sharing.

Appreciative inquiry
Appreciative inquiry is a process that 'promotes positive change by focusing on peak experiences and successes of the past' (Mathie and Cunningham, 2002, 7). It draws on educational psychology about the sources of collective and personal motivation (Mathie and Cunningham, 2002, 7) and aims to challenge internalised negativity and move towards a more appreciative construction of a community, organisation or situation. Appreciative inquiry utilises the 'heliotropic principle' (Elliott, 1999): 'just as plants grow towards their energy source, so do communities and organisations move towards what gives them life and energy. To the extent that memory and the construction of everyday reality offer hope and meaning, people tend to move in that direction' (Mathie and Cunningham, 2002, 7).
Appreciative inquiry is associated with asset-based community development (ABCD) due to its shared 'commitment to discovering a community's capacities and assets', rather than a 'community's needs, deficiencies and problems' (Kretzmann and McKnight, 1993, 1). Theoretically, an appreciative inquiry approach to evaluation is close to social constructivism, where sense making and meaning are achieved through dialogue and interaction (Coghlan et al, 2003, 17).
Appreciative inquiry uses interviews and storytelling as a way of drawing out positive experiences and memories, and then relies on a collective identification and analysis of critical elements of success. Storytelling is particularly important in co-production, not only in evidencing the significance of its relational dynamics but also in representing different voices and experiences in an accessible way (Durose et al, 2013, 22). Storytelling also helps in building shared commitment and understanding (Layard et al, 2013), and in identifying community successes and the capacities of communities which contributed to those successes (Mathie and Cunningham, 2002). The potential of using stories as part of an evaluation approach is demonstrated by a hospital in the North-West of England (Cumbria Partnership NHS Foundation Trust, 2012, cited in Durose et al, 2013, 22) which collected patients' stories of their experiences of healthcare in the community and community hospitals, and shared them with staff as part of a learning and development programme. The initiative was a way to circulate patient experiences and demonstrate how simple misunderstandings can impact on that experience. The stories were then used to make a short film raising the question, 'do you always see the person in the patient?'
In another example of appreciative inquiry, a structured dialogue method was used by a third-sector organisation in Birmingham, seeking to use storytelling as a basis for evaluation. The structured dialogue method is a technique for listening critically to stories and using them in policy development and evaluation: 'Stories don't just reflect the culture: they ARE the culture. If you want to change culture, you need to change the stories and the way people tell them' (Slatter, 2010, cited in Durose et al, 2013, 22). Key elements of the approach involve: a provocative theme (something to generate animated discussion); a diverse storytelling circle of around ten to fifteen people; two storytellers willing to share their experience; active reflection of all participants, not just the storytellers; structured questioning, not general discussion; and a skilled facilitator to manage the process (Slatter, 2010, cited in Durose et al, 2013, 22).
Proponents of appreciative inquiry suggest that its use as an evaluation approach can have benefits beyond traditional approaches. As with traditional approaches, appreciative inquiry evaluation is able to measure change over time, as the process 'develops [a] programme logic model, clarifies the evaluation purpose, identifies the stakeholders, determines the evaluation key questions, develops indicators and develops evaluation plan' (Preskill and Tzavaras Catsambas, 2006). However, it is different from other approaches because it encourages the use of stories and the focus is on what is working rather than what is not. The approach also provides an organisation with a 'process by which the best practice of the organization can become embedded as the norm against which general practice is tested' (Elliott, 1999, 202). Additionally, it goes beyond the conventional evaluation and integrates evaluation results into the future actions of the subject being evaluated (Ojha, 2010, 11). Watkins and Mohr (2001, 183) suggest that as any intervention shapes the future direction of an organisation, the first questions that are asked are vital, so:
In an evaluation using an appreciative framework, the first questions asked would focus on stories of best practices, positive moments, greatest learnings, successful processes, generative partnerships, and so on. This enables the system to look for its successes and create images of a future built on those positive experiences from the past. (2001, 183)
An appreciative inquiry approach may have specific benefits in evaluating the success of new or developing initiatives, or where an organisation may be unsure of itself. Coghlan et al (2003, 202-3) suggest that the approach is much less threatening and judgmental than many variants of traditional evaluation for it invites the staff, and indeed, in theory, all the stakeholders, to reflect on their best practice rather than to admit their failures and unsolved problems.
A criticism of appreciative inquiry as an evaluation approach might be that it ignores problems. However, evidence suggests that appreciative inquiry does address issues and problems, but from a different and often more constructive perspective:
It reframes problem statements into a focus on strengths and successes. A traditional evaluation technique might ask participants to carry out a diagnostic into what is not working well, whereas [appreciative inquiry] will ask them to explain what is going well, why it is going well, and what they want more of in the organization. (Coghlan et al, 2003, 6)
Coghlan et al (2003) suggest that there are some contexts in which an appreciative inquiry approach has the most potential as an evaluation tool. Several of these are relevant to the evaluation of co-production approaches, including:
• where previous evaluation efforts have failed
• where there is a fear of or scepticism about evaluation
• with varied groups of stakeholders who know little about each other or the programme being evaluated
• when it is important to increase support for evaluation and possibly the programme being evaluated
Although appreciative inquiry is offered here as a 'good enough' evaluation approach, it should not be any less challenging or rigorous than traditional approaches. Rogers and Fraser (2003) note that an appreciative approach to evaluation requires specific skills, without which it could lead to vacuous, self-congratulatory findings. They suggest that when multidisciplinary teams are being assembled, consideration should be given to including members who have the affirming types of skills needed to apply the technique properly.

Peer-to-peer learning
A second 'good enough' approach to gathering data can be to utilise peer-to-peer learning. Peer learning is 'as old as any form of collaborative or community action, and probably has always taken place, sometimes implicitly and vicariously' (Topping, 2005, 631). Philosopher, psychologist and educational reformer John Dewey (1916) argued that 'education is not an affair of "telling" and being told, but an active and constructive process'. Peer learning can be defined as:
the acquisition of knowledge and skill through active helping and supporting among status equals or matched companions. It involves people from similar social groupings… helping each other to learn and learning themselves by so doing. (Topping, 2005, 631)
This constructivist view asserts that knowledge is created by experience. It provides a counter-hegemonic force by challenging the embedded knowledge hierarchies of the expert versus the layperson (Porter, 2010) and in doing so shares and mirrors the aspirations of co-production. For example, the noted work on critical pedagogy by Paulo Freire (1996) uses critical dialogue and reflection on shared experience as a basis for intervention and action.
Whilst peer learning has been under-theorised traditionally, more recent theory-building and synthetic analysis has helped to identify the sub-processes which influence the effectiveness of peer learning and understand why it may not only be a pragmatic means of gathering evidence, but may also promote 'more effective onward learning' (Topping, 2005, 638). These sub-processes include: organisational and structural features, for example the need for both parties to elaborate goals and plans; the cognitive conflict and challenge involved, for example dispelling of myths and testing of assumptions; scaffolding, in the sense of support from a more competent and experienced other; communication demands which encourage and develop those skills ('a participant might never have truly grasped a concept until having to explain it to another, embodying and crystallising thought into language' (Topping, 2005, 637)); and the affective component of peer learning:
A trusting relationship with a peer who holds no position of authority might facilitate self-disclosure of ignorance and misconception, enabling subsequent diagnosis and correction… modelling of enthusiasm, competence, and the possibility of success can influence the self-confidence of the helped, while a sense of loyalty and accountability to each other might help to keep the pair motivated and on-task. (Topping, 2005, 637-8)
Evidence has pointed to the cost-effectiveness of peer learning as a learning strategy (Levine et al, 1987, cited in Topping, 2005, 635).
Research and evidence on peer learning is concentrated in the field of education, focusing on interaction within a formal education setting. However, there is increasing interest in peer learning within organisational studies, focusing on the generation and spread of innovation and good practice. Research suggests that ideas are spread through horizontal connections, such as geographical proximity or regional identification, socioeconomic equivalence, political similarity, and psychological identification (Brannan et al, 2008, 26). However, the role played by informal interpersonal contacts and networks of near peers in spreading new ideas is also recognised (Kolb and Fry, 1976; Page et al, 2004; Brannan et al, 2008). Such contacts are considered 'the most truthful and useful sources of information' (Wolman and Page, 2002, 27). For the purposes of gathering data and evidence, this research suggests the importance of careful matching.
A 'critical friend' role is one way to formalise the peer learning experience (Swaffield and MacBeath, 2002). Costa and Kallick (1993, 50) suggest that a critical friend
asks provocative questions, provides data to be examined through another lens, and offers critiques of a person's work… takes the time to fully understand the context of the work presented and the outcomes that the person or group is working toward.
Costa and Kallick (1993) see a critical friend as a potential advocate; as Smith reinforces, a critical friend
can articulate and bring out into the open aspects of the project that may… enhance its impact and thereby assist it to gain recognition for its achievements. (2004, 344)
Peer learning is an approach with growing potential. The assumption of peer learning as a spatially anchored relationship between near peers is opened up through technological tools, and applications such as wikis, free open source software and the internet now provide platforms for peer learning and data generation. Peer-to-peer learning may also foreground communities of practice. Wenger et al define communities of practice as
groups of people who share a concern, a set of problems, or a passion about a topic, and who deepen their knowledge and expertise in this area by interacting on an ongoing basis. (2002, 4)
These practice-based communities can be found throughout history and continue to proliferate today. In this way, the benefit of peer-to-peer learning lies not only in the generation of evidence, but in the scaling out of innovation and good practice.
Government initiatives are beginning to experiment with peer-to-peer learning not only to 'make a case' but to spread ideas and innovation (CLG, 2013). For example, Our Place is a programme developed by the UK's Department for Communities and Local Government and Locality, a national network of community organisations, to support approximately 100 neighbourhood projects across England and Wales. Each project identifies a thematic priority, such as social care for older people, local employment, or after-school provision, and then aims to give local people more power over the services, budgets and outcomes in their neighbourhood. The design of the Our Place initiative recognises that 'what is most powerful is direct contact, one area visiting another, asking questions, seeing work in practice' (Durose et al, 2013, 28). It is premised on building trusting relationships between projects so that they 'believe it's real… that [it's] people like me' (Durose et al, 2013, 29).
As part of the package of support provided to neighbourhoods, each Our Place project is provided with a 'critical friend' in the form of a relationship manager who provides coaching and mentoring and is paired with a pilot project or 'Champion', which is further developed and can offer support and guidance. Each Our Place project is also matched with others working on a similar theme and those in the same region. Ideas, inspiration and problem-solving are shared through face-to-face visits and fora, online through wiki discussions and by offering peer comments and suggestions on the operational plans being developed (Locality, 2014).

Data sharing
Among the most significant developments in social science in recent years has been the explosion of interest in Big Data and associated forms of quantitative analysis of secondary data. This growth potentially spans the boundary between academia and practice and offers opportunities for 'good enough', dispersed and cost-effective forms of evaluation. The rise of Big Data reflects the widespread availability of technology to collect and analyse huge amounts of data. Volume is clearly a distinctive element of Big Data but so too is its 'completeness', its dynamic nature, captured in real time and over time, and its potential to reveal 'real' events rather than those (selectively) reported through interviews and surveys (Tinati et al, 2014; Hale and Margetts, 2012). It is sometimes more simply characterised as high-volume, high-velocity and high-variety information (de Las Casas et al, 2013). The ability to make sense of Big Data is also dependent on new computational technologies (particularly mass storage, data linkage and speedy analysis) that allow the analysis, visualisation and presentation of useful information derived from often highly heterogeneous datasets (McDonnell, 2014).
Critics of Big Data have pointed to limitations and restrictions amongst the celebratory rhetoric. From a scholarly (ontological and methodological) perspective, and despite some claims to the contrary, Big Data has not displaced hypothesis-driven enquiry, nor has it removed the necessity to consider hidden bias and subjectivity in datasets, and as ever evidence of correlation needs to be matched by theoretical understanding and other tools (McDonnell, 2014). To paraphrase Uprichard's pithy assessment (2013), there is a risk that Big Data concerns itself with answering small questions. Reflecting on the scope for the 'new dawn' of Big Data to change evaluation practices, Gopalakrishnan et al (2013, 11) note:
Why convene a focus group when you can just analyze Twitter feeds? Moreover, why hire an evaluator at all when a well-structured algorithm can draw the same conclusions? We should keep in mind, however, that the phenomenon of Big Data is still very new and we need to address several issues related to privacy, accuracy, reliability, and use.
Given these concerns it is important to consider the closely-related 'open data' agenda linked to the opening up and greater use of secondary administrative data: that is, often previously confidential or simply inaccessible government-held information.Though arguably a phenomenon occurring in much of the developed world, the UK Coalition government  in particular promoted open data as part of its efforts to diversify provision, improve the assessment of outcomes, and increase transparency around service delivery (Cabinet Office, 2012).Partly in response, UK Research Councils have also promoted greater exploitation of secondary data sets.Open data is 'data that can be freely used, reused and redistributed by anyone -subject only, at most, to the requirement to attribute and sharealike' (open data handbook, quoted in de Las Casas et al, 2013).
There is potential for third sector organisations engaged in co-productive activities to use open data for benchmarking their own performance metrics, provided they have the requisite data analysis skills. However, there are a number of barriers to this. In many of the contexts in which co-production may play a role, a key barrier is likely to be confidentiality and data security, which add greatly to the cost and complexity inherent in the access, manipulation and analysis of such datasets. Just as limiting might be a lack of awareness of existing resources and of the mechanisms to exploit them, with the lack of obvious demand further hampering efforts to open up data that may well have useful applications.
One recent innovation, which appears to provide a solution to these varied issues of access and capacity, is the development of the Data Lab approach: a way of linking data held by smaller provider organisations with relevant administrative data in a secure setting, in order that they can establish the effectiveness of their interventions (de Las Casas et al, 2013). The foremost practical exemplar has been the establishment by the UK Ministry of Justice of a 'Justice Data Lab' (JDL), an internal unit containing evaluation and statistical expertise which provides analyses to voluntary sector organisations, social enterprises, and public and private sector organisations involved in providing services to reduce re-offending. The JDL supplies measures of re-offending for cohorts of individuals provided by organisations working in criminal justice, alongside re-offending measures for a matched comparison group of offenders selected through propensity score matching (Lyon et al, 2015). In the report supplied to provider organisations, the two re-offending measures are compared in order to determine the extent to which there has been a statistically significant change in re-offending in the target group.
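The logic of a matched comparison of this kind can be illustrated with a minimal sketch. It is not the JDL's own implementation: the `Person` record, the 0.05 caliper, the greedy one-to-one matching rule and the two-proportion z-test are all assumptions chosen for illustration, and the propensity scores are assumed to have been estimated beforehand (for example, by a logistic regression on offender characteristics).

```python
import math
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record: neither field name comes from the JDL.
    propensity: float  # assumed pre-estimated probability of receiving the intervention
    reoffended: bool   # whether the person re-offended in the follow-up period


def match_nearest(treated, pool, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity score, without replacement."""
    available = list(pool)
    pairs = []
    for person in treated:
        if not available:
            break
        best = min(available, key=lambda c: abs(c.propensity - person.propensity))
        # Keep the match only if it falls within the caliper (assumed tolerance).
        if abs(best.propensity - person.propensity) <= caliper:
            pairs.append((person, best))
            available.remove(best)
    return pairs


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value


def compare_reoffending(treated, pool, caliper=0.05):
    """Match the cohort to a comparison group, then compare re-offending rates."""
    pairs = match_nearest(treated, pool, caliper)
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparison cases found within the caliper")
    treated_count = sum(t.reoffended for t, _ in pairs)
    comparison_count = sum(c.reoffended for _, c in pairs)
    z, p_value = two_proportion_z_test(treated_count, n, comparison_count, n)
    return {
        "matched_pairs": n,
        "treated_rate": treated_count / n,
        "comparison_rate": comparison_count / n,
        "z": z,
        "p_value": p_value,
    }
```

A provider's cohort and a pool of untreated offenders would be passed to `compare_reoffending`, which returns the two re-offending rates and a p-value. The z-test treats the two matched groups as independent samples; a paired test might be preferred in practice, and small cohorts will often yield inconclusive results.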
Additional data labs are being explored in other fields of public service delivery such as health, substance misuse and homelessness (Gyateng et al, 2013). Although there appears to have been little activity directly addressing the effectiveness of co-production, the relatively flexible (though complex) way in which the data lab model can be implemented does suggest that there is potential, and it is an area in which further innovation is likely to occur. The experience with the JDL to date suggests it provides a more cost-effective platform through which traditional evaluation approaches can be bypassed: provider organisations are provided with the analysis at no cost, and of course are spared having to employ data analysts for the specific task. In this way the responsibility for evaluation is more dispersed, with organisations demonstrating directly with appropriate administrative data the success or otherwise of their approach. Where the results are favourable, the organisation can use them to convince commissioners that it should be awarded further contracts; where they are not, they can indicate that alternative approaches are needed (de Las Casas et al, 2013).

Conclusion
The scarcity of independent studies of the co-production of public services reflects the cost and time implications of commissioning independent evaluation, particularly for the third-sector provider organisations that have led the field in trialling co-production. The approaches set out here offer ways in which the evidence base for co-production can be strengthened. For policy entrepreneurs seeking to make the case for co-production, there is a range of strategies that can be utilised. The 'scaling out' approach offers sharp contrast to traditional 'scaling up' approaches to spreading innovation, which are underpinned by the aim of maximising efficiencies through economies of scale (Bunt and Harris, 2010; O'Donovan and Rubbra, 2012). A scaling out approach depends on arguments and evidence being presented in a way which resonates with the audience's lived experiences and values.
There also needs to be greater understanding of the role of values and argument in the policy process, and the limits to what evidence-based policy making (EBPM) can achieve. EBPM is too often a narrowly instrumental approach which privileges the measurement of impact and outcomes, failing to capture the relational possibilities of co-productive ways of working, and denying the inevitably political nature of evaluation (Wells, 2007). The call for 'tolerating epistemological and methodological diversity' made by Lambert (2014) in a public health context is just as relevant here. Crucially, if co-production is about involving the public in meaningful action around public service change, it cannot rely on forms of evidence gathering which are accessible only to a cadre of trained evaluators.
Rather than accepting a dichotomy between 'cosy stories of a few people's gains' (a phrase Beresford (2008) used to describe some of the personalisation evidence, but which could equally be applied to co-production), or a cost-benefit analysis worthy of the Magenta Book (HM Treasury's guide to undertaking evaluations), this paper has suggested pragmatic approaches for small-scale evaluation based on theory, values and attentiveness to the audience.Co-production may be hard to avoid in debates around public service reform but it needn't be hard to evidence.