Towards Standardization: A Participatory Framework for Scientific Standard-Making

In contemporary scientific research, standard-making and standardization are key processes for the sharing and reuse of data. The goals of this paper are twofold: 1) to stress that collaboration is crucial to standard-making, and 2) to urge recognition of metadata standardization as part of the scientific process. To achieve these goals, a participatory framework for developing and implementing scientific metadata standards is presented. We highlight the need for ongoing, open dialogue within and among research communities at multiple levels. Using the Long Term Ecological Research network adoption of the Ecological Metadata Language as a case example in the natural sciences, we illustrate how a participatory framework addresses the need for active coordination of the evolution of scientific metadata standards. The participatory framework is contrasted with a hierarchical framework to underscore how the development of scientific standards is a dynamic and continuing process. The roles played by ‘best practices’ and ‘working standards’ are identified in relation to the process of standardization.


Introduction
Expectations for the sharing and reuse of scientific data are increasing, as evidenced by current attention to data access technologies, funding agency data policies and data management education. Though data in the natural sciences are heterogeneous and difficult to standardize (e.g. Madin et al., 2008; Reichman et al., 2011; Mayernik, 2011; Willis et al., 2012), data sharing and reuse depend in part on metadata standardization through the development and enactment of metadata standards. The development of scientific metadata standards is sometimes mistakenly perceived as a well-defined, one-time undertaking. While efficiencies-of-scale drive the process for developing standards in economic arenas, such frameworks do not adequately characterize the process of developing metadata standards for scientific data.
To open up discussion, we introduce a participatory framework for the development of science metadata standards, grounded in respect for complexities-of-scale. A framework that is explicitly participatory: 1) opens up standards development efforts to ensure broader representation from science communities, 2) formalizes collaboration as a necessary component of standardizing scientific data descriptions, and 3) anticipates management of a trajectory of change throughout the process.
We argue for envisioning standard-making as an evolving, continuing design process, as opposed to a set of defined steps leading to a solution in the form of an enduring standard. The term 'framework' is used instead of 'standards lifecycle' in order to emphasize the complexity and relationships within broader, longer-term standardization efforts. Our aim is to foster dialogue and increase awareness of processes as central to developing useful and usable scientific standards.
The perspective of the authors is that of practicing information managers in the natural sciences. Years of experience with researchers, science networks and data centers have shown how conceptual development and inclusive, collaborative efforts are essential to standardization in the sciences.

Background
Standards are mechanisms of coordination that facilitate the exchange and comparability of information and products in the global community. Mature standards that are broadly accepted and implemented become authoritative resources for infrastructure development. As they exist across a number of arenas and carry a wide variety of functions, the study of standards is a multi-faceted endeavour. Previous work has addressed the classification of standards, as well as the tension between standardization and flexibility (Sherif, 2001; Hanseth et al., 1996; Molka, 1992; Lehr, 1992; deVries, 2005). The development of all types of standards and the slowness of the process are topics of research (Bowker & Star, 1999; Lampland & Star, 2009; Jakobs, 2005, 2010; Lyytinen & King, 2006). Standards were developed early on as physical specifications required in engineering and economic arenas: weights, measures and strengths (Thompson & Taylor, 2008). In the current era of global interconnectivity, standards are needed for the exchange and coordination of digital information, as well as material goods. deVries (2005) assembled a diverse set of definitions that reveal the multiple roles for standards as the need for standards beyond industry arenas was recognized (Rumble et al., 2005). The European Information and Communications Technologies Standards Board (ICTSB) defines a standard as a formal document within an existing context: "[A] document, established by consensus and approved by a recognized body, that provides, for common … use, rules, guidelines or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context."

Hierarchical Framework for Development of Standards
Commercial and engineering standards play a significant role in successes and failures of systems and infrastructures (e.g. Edwards, 2003; Edwards et al., 2007; Star & Ruhleder, 1996). Schmidt and Werle (1998) report: "The availability of standards not only helps firms to achieve economies of scale; it also facilitates the building of comprehensive, integrated technical systems." The goal of commercial development is to create a final, fully-functional product quickly. Participants in this standard-creation process include large national or international bodies with economic and political perspectives. As in the case of typing keyboards (Lewin, 2002; David, 2007) and mobile communications standards (Akhtar, 2005; Fomin, 1999), the process frequently involves selection of one approach from existing options. In a stylized presentation, Figure 1 presents a hierarchical framework for development of a standard using a stepwise 'top-down' approach. The upper boxes (labelled 'a' and 'b') signify exclusive agents who undertake creation of a standard through bounded, short-term discussions driven by efficiencies-of-scale. In this case, the process of creating standards is about reducing differences through elimination or selecting from alternatives, and/or through consolidation by generalizing details. The process is influenced by political considerations, limited time and desire to maximize return on investment. Once a standard has been finalized, the development process is complete, the standard is published and support for discussion ends. Individuals and groups (denoted 1, 2, 3) have access to the standard.

Metadata Standards in a Scientific Context
Shifting to a research context in the natural sciences, we find an arena of extreme complexity. To study the nonlinearities and interdependencies of interacting, living systems, scientific research incorporates rapidly changing observational methods that contribute to knowledge generation. The research process requires an adaptable infrastructure for the organization of scientific data described by metadata (Greenberg et al., 2009; Greenberg, 2010). The National Academy of Sciences defines metadata standards as descriptions of: "...the content, context, and structure of information objects, including research data, at any level of aggregation (for example, a single data item, many items, or an entire database)" (NAS, 2009).
In considering processes of collaboration and development for science metadata standards, we expand from technical factors and end products to socio-organizational factors and design.
The development of science metadata standards involves many different socio-organizational units, defined here as individuals or groups with their own situated culture, practices, resources and needs. Examples of socio-organizational units include a researcher's lab, a field project, a data center, a government agency, a field station, or a research community. Bringing these units together despite their differences is challenging, though there are a number of approaches that can help. Design is an integrative activity able to address complex issues as well as a multiplicity of organizational arrangements and political agencies (Green, 1996; Berg, 1998; Millerand et al., 2013). Participatory design highlights local engagement and expands to the long-term in continuing design (Schuler & Namioka, 1993; Dittrich et al., 2002; Karasti et al., 2010). Whelton and Ballard (2002) identify three design activity models that provide a context for learning: rational problem-solving, social process, and an experiential process. Efforts in the field of information systems design have explored a variety of approaches to support individual learning and group dialogue (e.g. Wagner et al., 2010; Millerand & Baker, 2010; Lyytinen, 1987). Design-oriented approaches for soft systems methodologies take broad views of human as well as technical requirements for information systems (e.g. Dix et al., 2004). Explicitly including human expertise in design helps to ensure that 'bottom-up experience' informs development of top-down, universal systems and standards (e.g. Hasselbring, 2000). In identifying standard-making as a design issue that crosses many socio-organizational levels, the dynamic interplay of social, political and learning processes in the natural sciences is acknowledged and accommodated.
Heterogeneous data result from the many technologies, domains, and diverse sampling and processing methods used to study complex natural systems. Differences encountered in science are not necessarily seen as non-conformity or a failure to unify (Cragin et al., 2010; Star, 1991; Gasser, 1986). Instead, heterogeneity is appreciated as that which informs scientists about systems at a variety of scales, though data reuse, automation and even discovery are made extremely difficult by non-standard data and metadata. Complexities-of-scale define the scientific context, though economies-of-scale, such as automated data processing and integration, are recognized as goals. Heterogeneous data and anomalies are features of the scientific process that require social and cultural as well as technical adaptation in standard-making efforts. Scientific metadata standards help document and mitigate heterogeneity. However, an existing multitude of standards has not been sufficient to achieve the promise of scientific data integration and reuse.

Metadata Standards in Practice
Metadata development initiatives have been underway for some time in national and international arenas. The Text Encoding Initiative (TEI) for marking up humanities documents was established in 1987, Dublin Core Metadata Initiative for cataloguing web-based resources started in 1995, and Darwin Core for documenting specimen data emerged around 1999. The Data Documentation Initiative (DDI) was funded in 1997 for describing social, behavioural and economic science metadata. Within the observational sciences, the Federal Geographic Data Committee (FGDC) was established in 1993 and created a Content Standard for Digital Geospatial Metadata (CSDGM) for describing geospatial data. Subsequently, a paper by an Ecological Society of America committee was published on non-geospatial ecological data (Michener et al., 1997), from which developed the Ecological Metadata Language (EML). Each standard has a story that provides insight into development efforts; we explore the EML story of scientific standard-making in practice.
The interpretation and enactment of EML by the Long-Term Ecological Research (LTER) program information managers for describing field ecology measurements provides an example of ongoing design activities. The LTER Network began in 1980 with a data manager included in each of six research teams. Today, the network has 26 sites and an Information Management community-of-practice crossing disciplines, as well as socio-organizational scales in a complex set of long-term arrangements. Continuity and oversight of EML are anchored at a technical center (Michener et al., 1997; Jones et al., 2006), while enactment activities are carried out at individual sites. EML was adopted by the Ecological Society of America and the Organization of Biological Field Stations; the LTER network adopted EML during the LTER's 2000-2010 'Decade of Synthesis' (Millerand & Bowker, 2009). The initial EML design process was open in theory but closed in practice, given a 'fire-hose' of information precluding real-time participation by data providers; this resulted in a more hierarchical approach to development of the standard (Karasti et al., 2010). As part of the enactment process, review criteria were developed against which site metadata quality and completeness could be judged (LTER IM, 2011; 2009). At the time, however, important questions were only beginning to be formulated, i.e. 'what metadata is needed?' and 'what is a dataset?' As understanding evolved, implementation participants realized that EML would require further refinement. Over time, LTER Information Managers worked with local scientists, shared site experiences and practices, and negotiated the standardization of content across sites. Information managers collaboratively developed a keyword controlled vocabulary (Porter, 2010) and a unit registry (Karasti et al., 2010), while scientists discussed language ambiguity and network-level developers pursued ontology-related data integration (Michener & Jones, 2012). As a case example, EML illustrates the interplay of hierarchical and participatory standard-making activities over time.
The development timeframe for EML was more than five years; this was followed by an even longer period of redesign, redeployment and re-enactment of the standard. The integrative work of identifying collaborative arenas, articulating requirements, and sharing, communicating and negotiating across differing participant needs, traditions and languages takes time. The process of standardization, including the enactment of standards, is best recognized in the sciences as occurring at all times and at differing paces in various spheres-of-context (Baker & Yarmey, 2009). As a result, the implementation of metadata standards is far less definitive and immediate than might be anticipated.

Challenges
Lack of adequate consideration for development frameworks makes it difficult to map expectations for science metadata standards to true standardization. Defaulting to a hierarchical development framework within a scientific context impacts timelines, roles and expectations; the ramifications can be extensive. Introducing metadata requirements prematurely can lead to rushed and incomplete data representation, misguiding future efforts and defeating the objective of data reuse. Adoption of a 'young' metadata standard can stifle capture of contextual metadata at the local level, for example by missing new facets of methodology or integration-enabling elements, as is currently the case with biological units. This can be perceived as prioritizing standards over scientific needs, exacerbating researcher disenfranchisement and widening the cultural gap between data professionals and scientists. Mismatching standards and functionalities is an issue as well; applying a metadata standard meant for discovery to capture deep contextual metadata may lead to inappropriate cross-walking of detailed data descriptions into general metadata fields. Further, high-level standards that generalize data description can increase provider concerns about data misuse. Implementing a standard is further complicated in cases where the evolution of the standard is faster than enactment efforts, so that enactors are constantly playing catch-up. Adding urgency, the phenomenon of path dependence (Puffert, 2010) describes the situation where an approach introduced early on may become 'locked in' regardless of its usefulness (Edwards et al., 2007; David, 2007).

Proposing a Participatory Framework
Considering the creation and implementation of standards as an activity completely separate from the conduct of science divides resources and increases the communication gap between metadata efforts and science. To increase bi-directional engagement in scientific metadata work and to mirror the conduct of science, we propose a participatory framework for scientific metadata standard-making, as shown in Figure 2. Aligning with scientific research processes introduces an opportunity to partner with researchers in describing their science and data, rather than imposing external metadata requirements. The proposed participatory framework has several key characteristics appropriate to the research context. First, the process is open and inclusive. Second, the process is designed to maximize capture of scientific understanding rather than simply implement a standard. Third, scientific heterogeneity is consciously identified and strategically managed through informed consensus and negotiation. As with scientific work, the focus is not solely on final outcomes but on awareness and reconciliation of differences. Fourth, the process is ongoing, reflecting the long-term process of learning. The longer planning window allows flexibility for diverse participants to anticipate change and align as needed, given local circumstances and resources. Circles 1, 2, and 3 on the left of Figure 2 symbolize three different formal or informal socio-organizational units that could include scientists, librarians, standard-making bodies and others. Emerging roles, such as data scientists, data managers and data curators, contribute key mediation work. Each socio-organizational unit has its own sphere-of-context, characterized by a unique configuration of organizational placement, project definition, technical aptitude, resources, research perspective and available infrastructure. Rather than needing to have a specific skill set, participants with all types of expertise are sought out and welcomed. In addition to being participants in development of a scientific metadata standard, those engaged in the process are continuing learners. Standard-making participants co-construct an evolving scientific understanding by carrying out data curation in a research context.
The arrows on the lower left of Figure 2 depict the reporting or sharing of participants' practices, needs and experiences with metadata (whether it is called 'metadata,' 'documentation,' or another term). Participants contribute expertise and lessons learned from their situated work. This early communication addresses the cultural need for relationship-building, increases participant buy-in, and offers each participant an understanding of future changes and process timelines. Formal and informal information exchange enables comparison across collections and perspectives. Comparison leads to categorization. Differences exposed through reporting can be prioritized for classification and possible consolidation. Analysis of exposed similarities and differences can provoke convergence in scientific understanding.
Moving along the lower part of Figure 2, negotiation occurs at the point of convergence. Faced with disparate metadata, participants can decide upon reconciliation through an agreed-upon standard. Possible negotiation outcomes include resolution or postponement, in addition to elimination and generalization. Ideally, negotiation is inclusive, with each participant representing local priorities, infrastructures, resources and needs. Negotiation points and agreements are articulated and captured through documentation. Documentation of scientific metadata standards and processes serves the same purpose as peer-reviewed publication; it keeps science open and cognizant of decisions that shape community understanding and research trajectories.

The outcome of the negotiation stage is a strategically released version of a standard (seen in the right-hand box). Necessary tools, guidelines and other implementation mechanisms are identified and addressed through collaborative efforts. As partners in the process, participants are empowered to take their new knowledge back to their local community for incorporation in decision-making and planning. Once deployed (upper right arrow, Figure 2), the implementation process includes expected divergence, as each socio-organizational unit separately interprets the standard in its own sphere-of-context. Enactment (upper left arrows) involves essentially a site-specific implementation cycle of design, development and deployment, taking into account local circumstances (Millerand et al., 2012; Millerand & Bowker, 2009). Since the situated practices, systems, and infrastructures (human, cultural and technological) differ, the standard and any new requirements must be aligned with existing practices. Any socio-organizational unit may find there are multiple ways to integrate the new standard with its own existing practices; formal or informal governance structures are used to decide among options. With standardization occurring in different ways depending upon local contingencies and readiness, timeframes and outcomes will differ as well.
Time may be needed for local redesign and to develop best practices for use of the standard. Best practices develop from local decisions made in interpreting a standard to ensure that local circumstances are managed uniformly (see Figure 2). A working standard, an ad hoc convention developed pragmatically to describe locally-developed procedures in response to a local need, may already exist and require modification or replacement as shown on the left of Figure 2. Alternatively, a site-specific working standard may develop as an enhancement to a community standard. As noted by Gasser (1986), local adjustments may appear initially as 'out of specification' but in time contribute via the proposed participatory process.
Socio-organizational units with adequate support for information systems are in the unique position of developing site-specific metadata schemes that can be mapped to a variety of existing and emerging metadata standards. This enables site compliance with multiple standards, while insulating local development efforts from external factors. Research-focused, local development efforts can adapt more rapidly than large-scale efforts and, being closer to the science, can better capture metadata requirements for data reuse. Sharing these situated experiences and understandings with others contributes to another round of the ongoing standard-making process.
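The idea of mapping one site-specific scheme to several standards can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any LTER site's system: the local field names, the sample record and the `crosswalk` helper are hypothetical, and the target names are simplified stand-ins loosely modelled on EML dataset elements and Dublin Core elements rather than complete schemas.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical site-specific record: field names are local conventions,
# not part of any published standard.
LOCAL_RECORD: Dict[str, Any] = {
    "dataset_name": "Stream temperature, 2009-2011",
    "lead_scientist": "A. Researcher",
    "summary": "Water temperature logged at a single stream site.",
    "sampling_protocol": "Logger at 0.5 m depth, 15 minute interval",
    "tags": ["temperature", "stream"],
}

# Illustrative crosswalks from local field names to target field names.
# The targets are simplified stand-ins (assumptions), not full schemas.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "eml_like": {
        "dataset_name": "dataset/title",
        "lead_scientist": "dataset/creator",
        "summary": "dataset/abstract",
        "sampling_protocol": "dataset/methods",
        "tags": "dataset/keywordSet",
    },
    "dc_like": {
        "dataset_name": "title",
        "lead_scientist": "creator",
        "summary": "description",
        "tags": "subject",
    },
}


def crosswalk(
    record: Dict[str, Any], mapping: Dict[str, str]
) -> Tuple[Dict[str, Any], List[str]]:
    """Map a local record to a target scheme.

    Returns the mapped record plus the local fields that have no
    counterpart in the target, so that lost context is made visible
    instead of being silently dropped.
    """
    mapped = {
        target: record[local]
        for local, target in mapping.items()
        if local in record
    }
    unmapped = sorted(name for name in record if name not in mapping)
    return mapped, unmapped


if __name__ == "__main__":
    for scheme, mapping in CROSSWALKS.items():
        mapped, unmapped = crosswalk(LOCAL_RECORD, mapping)
        print(scheme, "mapped fields:", sorted(mapped))
        print(scheme, "fields without a target:", unmapped)
```

Keeping the local scheme as the source of record and reporting fields without a target is one way to surface the kind of local detail (here, the sampling protocol) that a discovery-oriented standard may not accommodate; such gaps are candidates for reporting back into the standard-making process.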

The International Journal of Digital Curation
Volume 8, Issue 1 | 2013

Realities of Participatory Approaches
The participatory development of scientific metadata standards is distinguished from a hierarchical approach by the framework characteristics in Table 1. A hierarchical framework within a competitive market culture develops specifications in order to create an end product rapidly. In contrast, a participatory framework within a research culture develops integrative mechanisms over the long term that facilitate learning and support scientific inquiry. While participatory design can enhance thinking about standard-making, short funding cycles favor product-driven, hierarchical frameworks. Resources, infrastructure and expertise within the natural sciences are generally lacking as yet for inclusive activities, cooperation across socio-organizational units, and cross-community dialogue and engagement. Emerging roles for social scientists (e.g. Ribes & Baker, 2007), data curators (e.g. Cragin et al., 2010; Heidorn, 2011; Thompson, 2012), and data managers (e.g. Michener et al., 2011; Baker & Chandler, 2008) formalize engagement with scientists on these issues and integrate understanding back into the standard-making process. It is also important to consider how the many scientific socio-organizational units engage with technical agencies and thereby shape subsequent requirements and policies.
The participatory process does not come without difficulties. It is time consuming, costly, and leans heavily on governance structures that often do not yet exist. Organizing units are needed as decision-making bodies. Lower (2010) highlights two dimensions: a coordinating unit (community, consortium, government, market) and a standards object (process, semantics, performance and product). The kind of organizing unit for a standard influences the process through decision-making approaches, i.e. expert advice, widespread participation, consensus. Further governance is required to coordinate multiple organizing units when addressing a single standard.
Despite the difficulties associated with standard-making, we have promising examples at hand to consider. In addition to the LTER EML case example illustrating a hybrid model (hierarchical with participatory follow-up), the field of health science has early experience in community-based participatory research (CBPR), an approach that recognizes and addresses the value of engaging with diverse public participants (e.g. Israel et al., 1998). Envisioning the metadata standardization process in scientific research using a participatory framework highlights the need for early planning of time, resources, expectations and infrastructure to support incremental changes throughout an ongoing process.

The Process of Standardization
'Standardization' refers to the process of aligning standards, practices and content within and across socio-organizational units. Hierarchical standardization involves establishing compliance to set specifications, while scientific metadata standardization requires a participatory strategy for growing a shared pool of knowledge. Standardization is essentially a process of learning; Ribes and Finholt (2007) note: "...the creation of a metadata standard could be viewed as a form of new knowledge."
Creating science metadata standards within a participatory framework will necessarily involve negotiations to resolve disparate understandings and forge consensus through common terminology. The participatory framework presents an opportunity to articulate and develop common views, despite layers of complexity, thus linking standard-making more closely to standardization.
Collaboration is at the heart of human organization and growth of infrastructure (Lee et al., 2006; Bowker et al., 2010); it can take many forms depending on participant needs. While some commercial and scientific standards remove much of the need for interpersonal communication between those wishing to interoperate, the ongoing process of metadata standardization across the complex realms of science requires users and developers to work as co-designers in continuing collaboration.

Levels and Scales
Scientific research occurs in diverse socio-organizational units, where the work of standardizing local practices is ongoing for each unit through a variety of activities. With both overlapping and contradictory interests, each unit has a sphere-of-context defined along local-remote, higher-lower, and/or large-small continuums that collectively may be called levels. 'Level' refers to an ordering with differing scopes and responsibilities for the work that occurs in different contexts. From the perspective of an individual lab, a multi-investigator project is higher-level or more remote. However, from the perspective of a domain, a multi-investigator project may be at a lower level and of narrower scope. Communication and standardization take place both across units at the same level and across differing levels.

Fomin (1999) argues that the standard is a boundary-crossing object that enables integration across different organizational levels. His multi-level analysis considers the roles and function of standards in terms of micro-meso-macro scaled categories associated with social, technical, and economic forces, respectively (Fomin, 2003). With scientific metadata standards, we map instead to three socio-organizational levels: local, community and large-scale. Scientific metadata standards develop and are enacted at all levels, eventually interfacing with local, or micro, best practices and prompting development of working standards. Community gateways develop at the meso-scale as data repositories, such as the Global Biodiversity Information Facility (GBIF) and GenBank. At the macro-scale, recommendations about systemic solutions are meant to address large-scale issues.

Conclusions
This paper discusses standard-making as one part of the process of standardization for scientific metadata. A participatory framework is conceptualized and characterized by the expectation of ongoing collaboration. The framework highlights how 'a standard' is only one aspect of a complex process; standards alone do not inherently achieve standardization. Data reuse and automated processing rely not on metadata shoehorned into ill-fitting standards to meet minimum requirements, but rather on the advancement of science through collaborative learning at many levels and in many spheres-of-context. In both the scientific arena and with metadata standards, a participatory approach is essential for supporting scientific inquiry.
The development of multiple standards is an important part of research and is by no means a failure to solve the problem of metadata development in the natural sciences. The cyclical standard-making process illustrates that divergence due to local interpretation and enactment is as important as the convergence that removes barriers to standardization. Both are integral to the evolution of scientific metadata standards and to scientific inquiry. Further study of standards enactment, the roles of best practices and working standards, and standardization across levels is needed.
Given the infinite variety of ways to organize and classify digital data that represent observations of our world, participatory development of scientific metadata standards is required to capture the complexity of natural systems. Bringing experts from different arenas together through a participatory framework supports the research needed to answer grand challenge science questions of our day. In its provisionality, the process of standardization parallels the ongoing nature of the scientific knowledge-making process itself.

Figure 1. Hierarchical framework for developing standards where boxes a and b represent participants developing the standard and the circles represent organizational units.

Figure 2. Participatory framework for developing standards, with points of divergence (interpreting) and convergence (negotiating) shown as part of a cyclical process.

Table 1. Characteristics of two standard-development frameworks.