Elsevier

Journal of Systems and Software

Volume 109, November 2015, Pages 50-61
Scientific software development viewed as knowledge acquisition: Towards understanding the development of risk-averse scientific software

https://doi.org/10.1016/j.jss.2015.07.027

Highlights

  • Why scientists in risk-averse application domains are not “end-user programmers”.

  • Characteristics of scientists who develop software in risk-averse domains.

  • Presentation of knowledge acquisition software development model.

  • Why traditional development methodologies hamper development of scientific software.

  • Observations of how scientists develop software outside methods.

Abstract

This paper presents a model of software development based on knowledge acquisition. The model was formulated from 10 years of studies of scientific software and scientists who develop software as part of their science. The model is used to examine assumptions behind software development models commonly described in software engineering literature, and compare these with the observed way scientists develop software. This paper also explains why a particular type of scientist, one who works in a highly risk-averse application domain, does not conform to the common characterization of all scientists as “end-user programmers”. We offer observations of how this type of scientist develops trustworthy software. We observe that these scientists work outside the ubiquitous method-based software development paradigms, using instead a knowledge acquisition-based approach to software development. We also observe that the scientist is an integral part of the software system and cannot be excluded from its consideration. We suggest that use of the knowledge acquisition software development model requires research into how to support acquisition of knowledge while developing software, how to satisfy oversight in regulated application domains, and how to successfully manage a scientific group using this model.

Introduction

Scientific software development has been characterized as end-user programming (Segal, 2004), considered a candidate for Agile iterative development (e.g., Ackroyd et al., 2008), and regulated with waterfall-style software quality development standards (Canadian Standards Association). Scientists themselves characterize their development approach as “a-methodical” (Truex et al., 2000). This confusion of views of scientific software development hampers the creation of useful and usable tools, quality standards, and software development paradigms for scientists. Our aim in this paper is to (1) describe the common characteristics of the scientific software developers that we encountered in our studies, (2) argue that these scientists do not fall under the definition of “end-user programmers”, (3) offer, based on our studies, a different model of what drives software development by these scientific software developers, and (4) provide a new understanding of the software engineering research that would benefit this type of scientific software development and use.

For this paper, we define scientific software as application software that includes a large component of knowledge from the scientific application domain and is used to increase the knowledge of science for the purpose of solving real-world problems. We use the word “scientific” to include engineering applications.

Scientific software, by our definition, includes examples such as software to model loading on bridges, study safe operation of nuclear plants, track paths of hurricanes, locate satellites in telescope images, check mine shafts for rock faults, model medical procedures for cancer treatment, model dispersion patterns for toxic particulates, and study ocean currents for ecological impact.

The term “scientific software” has been used for a wide variety of software types that do not share the same quality requirements or the same management priorities. Software written to become a commodity product, for example, is managed to meet delivery dates and budget constraints. Software written to verify the safety of a radiation procedure has to be correct, to the exclusion of all else.

We also exclude from our definition software whose primary purpose is to control equipment. As explained in Kelly (2008), the quality goals of software that controls potentially dangerous equipment, such as avionics and nuclear reactor shut-down software, are different from the quality goals of scientific software that computes models of physical phenomena, such as tracking the path of a hurricane. If there is a failure in avionics software, the preference is that the software degrades as gracefully as possible. If there is a failure in software tracking severe weather, the preference is that it crashes and makes the problematic calculation as obvious as possible. One consequence is that software quality standards targeted at control software are inappropriate for scientific software.
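The preference for a visible crash over graceful degradation can be illustrated with a minimal sketch. It is not taken from the studies described in this paper; the names `NonFiniteResultError`, `checked`, and `advect`, and the toy one-dimensional track update, are assumptions made purely for illustration.

```python
import math


class NonFiniteResultError(ArithmeticError):
    """Hypothetical error type: a model step produced a value that cannot be trusted."""


def checked(value: float, label: str) -> float:
    """Return value unchanged, or stop immediately if it is NaN or infinite."""
    if not math.isfinite(value):
        # Fail loudly and name the calculation, rather than substituting a
        # default and continuing as control software might.
        raise NonFiniteResultError(f"{label} produced non-finite value: {value!r}")
    return value


def advect(position_km: float, speed_km_per_h: float, hours: float) -> float:
    """Toy one-dimensional track update (illustrative only, not a weather model)."""
    return checked(position_km + speed_km_per_h * hours, "advect")
```

A call such as `advect(0.0, float("nan"), 3.0)` raises at the step where the bad value first appears, so the scientist sees which calculation is suspect instead of receiving a plausible-looking but untrustworthy result.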

We also exclude generalized tools from our definition, even if the tools are primarily intended for use by scientists. Examples include mathematical libraries, software layered to hide the complexity of high performance computing environments, and fourth generation languages intended for scientific computation. We include, instead, the applications built “on top” of these tools, which are aimed at solving a particular scientific problem.

To clarify, we further characterize scientific software with the following:

  • (a) a scientific domain specialist is necessarily involved in the process of developing the software;

  • (b) the user of this software has some minimum knowledge of the associated scientific domain, to allow correct interpretation of the output data;

  • (c) the user is the recipient of all output from the software, meaning the software's purpose is not to control equipment;

  • (d) the software's primary purpose is to provide data for understanding specific real world problems, meaning that the scientists we study do not develop generalized tools and libraries to support computational computing;

  • (e) the overriding software quality is correctness – or more accurately, trustworthiness – and if trustworthiness fails, then all other software qualities are irrelevant.

This paper is organized as follows:

The next section describes the set of studies of scientific software developers that we carried out from 2004 to 2014. This body of work provides the background for our understanding of the scientific software developer, and a basis for our discussions in the balance of this paper.

Next, we contrast our findings with the commonly held characterization of scientists as “end-user programmers”. Ko et al. (2011) provide a definition and detailed discussion of the characteristics of the “end-user programmer”. We argue that “end-user programmer” does not accurately characterize the type of scientist we are studying. Hence the body of research on end-user programmers cannot be applied unilaterally to this type of scientist and their software.

In Section 4 of this paper, as an alternative to the view of the scientific software developer as end-user programmer, we offer a model of scientific software development based on the scientist acquiring knowledge from five knowledge domains. We are not proposing a new process theory, but an empirical example of an alternative to “methods”. We use our knowledge domain model to explain how approaches based on methods assume a fragmentation of knowledge that is detrimental to the development of scientific software.

In Section 5, we discuss, from our studies, the activities scientists engage in to advance their knowledge and to maintain trust in their scientific software, while not engaging in methods.

Finally, Section 6 concludes with a summary of the contributions of this paper.

Section snippets

2.1. Overview of a synthesis of research

From 2004 to 2014, we carried out a variety of studies looking at different aspects of scientific software development. In this section, we give a brief description of each study, a list of references that provide further details, and explain the findings salient to the discussion in this paper. The discussions in this paper are a synthesis of this work.

The studies took different formats, from open-ended interviews of a group of scientists to working with and observing one scientist. Our

Scientists as professional end-user programmer

The most common characterization of scientists who develop software is as end-user programmers. This allows the software engineering community to slot scientists into a body of research to help understand and recommend software engineering approaches to improve the scientists' software work. The most obvious reason to characterize scientists as end-user programmers is that they do not consider themselves to be in the software business. Segal (2004) refined the label to “professional

A model of knowledge acquisition as a driver for the development of scientific software

In order to fully understand how scientists view and develop their software, we need to change from a method-based view of software development where the product is the software, to a non-method view of software development where the product is the scientist's knowledge.

At least since 1990 (e.g., Guindon, 1990), researchers have considered how knowledge is acquired and expressed in any type of software. Earlier, researchers (e.g., Curtis et al., 1979) were aware that human understanding played

How scientists develop software outside the “methods” approach

Our observations (e.g., Kelly, November 2013, Sanders, 2008, Sanders and Kelly, July/August 2008), and those of others (e.g., Sletholt et al., 2012), are that scientists engage in software development outside the methods paradigm. But, because methods are so dominant in the software engineering literature (Ralph, 2012) and assumed by many to be the only valid approach to software development, scientists have been criticized for not following methods (e.g., Merali, 2010) and have been offered a

Summary and conclusions

None of the current software engineering views on how scientists do – or should – develop software is universally applicable to the wide and varied range of what is termed “scientific software”.

The most common view is that scientists are end-user programmers. This is based on observations that scientists do not self-identify as professional programmers, that they do not produce software as end products, that their user base is small, and that scientists are not using “systematic and

Acknowledgment

This work is funded by the Natural Sciences and Engineering Research Council of Canada (NSERC). Many thanks go to the scientists and engineers who offered their time and enthusiasm for these studies. Several talks given by the author based on these interviews were funded by IEEE.

References (37)

  • Crabtree C.A. et al.

    An empirical characterization of scientific software development projects according to the Boehm and Turner model: a progress report

  • Curtis W. et al.

    Measuring the psychological complexity of software maintenance tasks with the Halstead and McCabe metrics

    IEEE Trans. Softw. Eng.

    (1979)

  • Floyd B.D. et al.

    Simplicity research in information and communication technology

    IEEE Comput.

    (2013)

  • Gray R.C.

    Investigating Test Selection Techniques for Scientific Software

    (2010)

  • Hook D.

    Using Code Mutation to Study Code Faults in Scientific Software

    (2009)

  • Hook D. et al.

    Mutation sensitivity testing

    IEEE Comput. Sci. Eng.

    (2009)

  • Jackson M.

    Problem Frames

    (2001)

  • Jorgensen P.C.

    Software Testing: A Craftsman's Approach

    (1995)
Diane Kelly is an Associate Professor in the Department of Mathematics and Computer Science at the Royal Military College (RMC) of Canada. She is cross-appointed to RMC's Department of Electrical and Computer Engineering and to the School of Computing at Queen's University. Diane has a Ph.D. and MEng in Software Engineering both from RMC. Her B.Sc. in Pure Mathematics and B.Ed. in Mathematics and Computer Science are both from the University of Toronto. Diane worked in industry for over 20 years as a scientific software developer, technical trainer, and QA advisor. She is a senior member of IEEE.