A survey of the use of crowdsourcing in software engineering
Introduction
Crowdsourcing is an emerging distributed problem-solving model based on the combination of human and machine computation. The term ‘crowdsourcing’ was jointly coined by Howe and Robinson in 2006 (Howe, 2006b). According to the widely accepted definition presented in that article, crowdsourcing is the act of an organisation outsourcing its work to an undefined, networked labour force using an open call for participation.
Crowdsourced Software Engineering (CSE) derives from crowdsourcing. Using an open call, it recruits global online labour to work on various types of software engineering tasks, such as requirements extraction, design, coding and testing. This emerging model has been claimed to reduce time-to-market by increasing parallelism (Lakhani et al., 2010; LaToza et al., 2013; Stol and Fitzgerald, 2014), and to lower costs and defect rates through flexible development capability (Lakhani et al., 2010). Crowdsourced Software Engineering is implemented by many successful crowdsourcing platforms, such as TopCoder, AppStori, uTest, Mob4Hire and TestFlight.
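The open-call model described above can be illustrated with a minimal sketch (all class and method names here are hypothetical, not the API of any real platform such as TopCoder): a requester posts a task with a reward, any registered worker may submit a solution, and the requester selects a winner, contest-style.

```python
from dataclasses import dataclass, field

@dataclass
class Submission:
    worker: str
    quality: float  # e.g. a review score assigned by the requester

@dataclass
class OpenCallTask:
    """A hypothetical open-call task: any worker may participate."""
    description: str
    reward: float
    submissions: list = field(default_factory=list)

    def submit(self, worker: str, quality: float) -> None:
        # An open call places no restriction on who may submit.
        self.submissions.append(Submission(worker, quality))

    def select_winner(self) -> str:
        # Contest-style selection: the highest-rated submission wins.
        best = max(self.submissions, key=lambda s: s.quality)
        return best.worker

task = OpenCallTask("Write unit tests for module X", reward=500.0)
task.submit("alice", quality=0.8)
task.submit("bob", quality=0.9)
print(task.select_winner())  # -> bob
```

The parallelism claim cited above corresponds to many workers submitting independently against the same task; the requester pays only for the selected result.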
The crowdsourcing model has been applied to a wide range of creative and design-based activities (Cooper et al., 2010; Norman et al., 2011; Brabham et al., 2009; Chatfield and Brajawidagda, 2014; Alonso et al., 2008). Crowdsourced Software Engineering has also rapidly gained interest in both industrial and academic communities. Our pilot study for this survey revealed a dramatic rise in recent work on the use of crowdsourcing in software engineering, yet many authors claim that there is ‘little work’ on crowdsourcing for/in software engineering (Schiller and Ernst, 2012; Schiller, 2014; Zogaj et al., 2014). These authors can easily be forgiven for this misconception, since the field is growing quickly and touches many disparate aspects of software engineering, forming a literature that spreads over many different software engineering application areas. Although previous work demonstrates that crowdsourcing is a promising approach, it usually targets a specific activity or domain in software engineering. Little is yet known about the overall picture: what types of software engineering tasks have been crowdsourced, which types are more suitable for crowdsourcing, and what the limitations of and issues for Crowdsourced Software Engineering are. This motivates the need for the comprehensive survey that we present here.
The purpose of our survey is two-fold: first, to provide a comprehensive survey of current research progress on using crowdsourcing to support software engineering activities; second, to summarise the challenges for Crowdsourced Software Engineering and to reveal to what extent these challenges have been addressed by existing work. Since this field is an emerging, fast-expanding area of software engineering that has yet to achieve full maturity, we strive for breadth in this survey. The included literature may directly crowdsource software engineering tasks to the general public, indirectly reuse existing crowdsourced knowledge, or propose a framework to enable the realisation or improvement of Crowdsourced Software Engineering.
The remainder of this paper is organised as follows. Section 2 introduces our methodology for literature search and selection, with detailed numbers for each step. Section 3 presents background information on Crowdsourced Software Engineering. Section 4 describes practical platforms for Crowdsourced Software Engineering, together with their typical processes and relevant case studies. Section 5 provides a finer-grained view of Crowdsourced Software Engineering based on its application domains in the software development life-cycle. Sections 6 and 7 describe current issues, open problems and opportunities. Section 8 discusses the limitations of this survey. Section 9 concludes.
Section snippets
Literature search and selection
The aim of conducting a comprehensive survey of all publications related to Crowdsourced Software Engineering necessitates a careful and thorough paper selection process. The process consists of several steps, described as follows:
To start with, we defined the inclusion criteria for the surveyed publications: the main criterion for including a paper in our survey is that the paper should describe research on crowdsourcing
Definitions, trends and landscape
We first review definitions of crowdsourcing, before proceeding to the focus of Crowdsourced Software Engineering.
Crowdsourcing practice in software engineering
In this section, we describe the most prevalent crowdsourcing platforms together with typical crowdsourced development processes for software engineering. Since most case studies we collected were based on one (or several) of these commercial platforms, in the second part of this section, we present relevant case studies on the practice of Crowdsourced Software Engineering.
Crowdsourcing applications to software engineering
Crowdsourcing applications to software engineering are presented in multiple subsections, organised by the software development life-cycle activities to which they pertain. The following major stages are addressed: software requirements, software design, software coding, software testing and verification, and software evolution and maintenance. An overview of the research on Crowdsourced Software Engineering is shown in Table 6. The references that map to each of the software engineering tasks are
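The stage-by-stage organisation above can be mirrored in a simple lookup. The stage names come from the text; the example tasks attached to each stage are purely illustrative placeholders, not the survey's actual Table 6 mapping:

```python
# Illustrative mapping from life-cycle stage to example crowdsourced tasks.
# The task examples are hypothetical; the survey's real mapping is in Table 6.
CSE_BY_STAGE = {
    "requirements": ["requirements extraction from user feedback"],
    "design": ["UI design contests"],
    "coding": ["component development contests"],
    "testing and verification": ["crowdtesting on real devices"],
    "evolution and maintenance": ["mining app reviews for maintenance tasks"],
}

def stages_covering(keyword: str) -> list:
    """Return the stages whose example tasks mention a keyword."""
    return [stage for stage, tasks in CSE_BY_STAGE.items()
            if any(keyword in task for task in tasks)]

print(stages_covering("contests"))  # -> ['design', 'coding']
```

Such a keyword view is one simple way to see that contest-based crowdsourcing concentrates in some stages while feedback-mining approaches concentrate in others.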
Issues and open problems
Despite the extensive application of crowdsourcing in software engineering, the emerging model itself faces a series of issues that raise open problems for future work. These issues and open problems have been identified by previous studies; however, few studies have focused on solutions to address them.
According to an in-depth industrial case study on TopCoder (Stol and Fitzgerald, 2014c), key concerns including task decomposition, planning and scheduling, coordination and
Opportunities
This section outlines five ways in which the authors believe Crowdsourced Software Engineering may develop as it matures, widens and deepens its penetration into software engineering methods, concepts and practices.
Threats to validity of this survey
The most relevant threats to validity for this survey study are the potential bias in the literature selection and misclassification.
Literature search and selection. Our online library search was driven by the keywords related to crowdsourcing and software engineering. It is possible that our search missed some studies that implicitly use crowdsourcing without mentioning the term ‘crowdsourcing’, or those studies that explicitly use crowdsourcing in the software engineering activities which
Conclusions
In this survey, we have analysed existing literature on the use of crowdsourcing in software engineering activities and research into these activities. The study has revealed a steadily increasing rate of publication and has presented a snapshot of the research progress of this area from the perspectives of theories, practices and applications. Specifically, theories on crowdsourced software development models, major commercial platforms for software engineering and corresponding case studies,
Acknowledgments
The authors would like to thank the many authors who contributed their valuable feedback in the ‘pseudo-crowdsourced’ checking process of this survey, and the anonymous referees for their comments.
Ke Mao is funded by the UCL Graduate Research Scholarship (GRS) and the UCL Overseas Research Scholarship (ORS). This work is also supported by the Dynamic Adaptive Automated Software Engineering (DAASE) programme grant (EP/J017515), which fully supports Yue Jia and partly supports Mark Harman.
References (259)

- et al. (2008). Motivation in software engineering: a systematic literature review. Inf. Software Technol.
- et al. (2013). The apple business model: crowdsourcing mobile applications. Accounting Forum.
- et al. (2014). Babel pidgin: SBSE can grow and graft entirely new functionality into a real world system. Proc. 6th Symposium on Search Based Software Engineering.
- et al. (2004). Guide to the Software Engineering Body of Knowledge (SWEBOK®).
- et al. (2004). Mutation testing using genetic algorithms: a co-evolution approach. Proc. 6th Annual Genetic and Evolutionary Computation Conference.
- et al. (2012). CrowdREquire: a requirements engineering crowdsourcing platform. Technical Report.
- et al. (2013). Evolving readable string test inputs using a natural language model to reduce human oracle cost. Proc. 6th IEEE International Conference on Software Testing, Verification and Validation.
- et al. (2013). ProtectMyPrivacy: detecting and mitigating privacy leaks on iOS devices using crowdsourcing. Proc. 11th Annual International Conference on Mobile Systems, Applications, and Services.
- et al. (2013). Alice in warningland: a large-scale field study of browser security warning effectiveness. Proc. 22nd USENIX Conference on Security.
- et al. (2013). Crowdsourcing user interface adaptations for minimizing the bloat in enterprise applications. Proc. 5th ACM SIGCHI Symposium on Engineering Interactive Computing Systems.
- Social adaptation: when software gives users a voice. Proc. 7th International Conference on Evaluation of Novel Approaches to Software Engineering.
- Social sensing: when users become monitors. Proc. 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering.
- Quality control in crowdsourcing systems: issues and directions. IEEE Internet Comput.
- The design of adaptive acquisition of users feedback: an empirical study. Proc. 9th International Conference on Research Challenges in Information Science.
- Crowdsourcing for relevance evaluation. ACM SIGIR Forum.
- Method-call recommendations from implicit developer feedback. Proc. 1st International Workshop on CrowdSourcing in Software Engineering.
- Proposing a system to support crowdsourcing. Proc. 2012 Workshop on Open Source and Design of Communication.
- Money, glory and cheap talk: analyzing strategic behavior of contestants in simultaneous crowdsourcing contests on TopCoder.com. Proc. 19th International Conference on World Wide Web.
- Multi-objective improvement of software using co-evolution and smart seeding. Proc. 7th International Conference on Simulated Evolution and Learning.
- Crowdsourced web augmentation: a security model. Proc. 11th International Conference on Web Information Systems Engineering.
- Addressing JavaScript JIT engines performance quirks: a crowdsourced adaptive compiler. Proc. 23rd International Conference on Compiler Construction.
- Harnessing Stack Overflow for the IDE. Proc. 3rd International Workshop on Recommendation Systems for Software Engineering.
- A market-based approach to software evolution. Proc. 24th ACM SIGPLAN Conference Companion on Object Oriented Programming Systems Languages and Applications.
- Beyond open source: the TouchDevelop cloud-based integrated development and runtime environment. Technical Report.
- The oracle problem in software testing: a survey. IEEE Trans. Software Eng.
- Facilitating crowd sourced software engineering via Stack Overflow.
- Finding source code on the web for remix and reuse.
- Social networking meets software development: perspectives from GitHub, MSDN, Stack Exchange, and TopCoder. IEEE Software.
- Social media for software engineering. Proc. FSE/SDP Workshop on Future of Software Engineering Research.
- Crowd-powered interfaces. Proc. 23rd Annual ACM Symposium on User Interface Software and Technology.
- Code hunt: experience with coding contests at scale. Proc. 37th International Conference on Software Engineering - JSEET.
- Repeatable and reliable search system evaluation using crowdsourcing. Proc. 34th International ACM SIGIR Conference on Research and Development in Information Retrieval.
- Software engineering economics.
- Crowdsourcing as a model for problem solving: an introduction and cases. Convergence.
- Crowdsourcing public participation in transit planning: preliminary results from the Next Stop Design case. Transportation Research Board.
- Scaling requirements extraction to the crowd: experiments with privacy policies. Proc. 22nd IEEE International Requirements Engineering Conference.
- Reducing energy consumption using genetic improvement. Proc. 17th Annual Genetic and Evolutionary Computation Conference.
- IDE 2.0: leveraging the wisdom of the software engineering crowds.
- IDE 2.0: collective intelligence in software development. Proc. FSE/SDP Workshop on Future of Software Engineering Research.
- Crowdroid: behavior-based malware detection system for Android. Proc. 1st ACM Workshop on Security and Privacy in Smartphones and Mobile Devices.
- Crowdsourcing mobile web applications. Proc. ICWE 2013 Workshops.
- Crowdsourcing hazardous weather reports from citizens via twittersphere under the short warning lead times of EF5 intensity tornado conditions. Proc. 47th Hawaii International Conference on System Sciences.
- Who asked what: integrating crowdsourced FAQs into API documentation. Proc. 36th International Conference on Software Engineering (ICSE Companion).
- Crowd debugging. Proc. 10th Joint Meeting on Foundations of Software Engineering.
- Quadrant of Euphoria: a crowdsourcing platform for QoE assessment. IEEE Netw.
- Puzzle-based automatic testing: bringing humans into the loop by solving puzzles. Proc. 27th IEEE/ACM International Conference on Automated Software Engineering.
- AR-Miner: mining informative reviews for developers from mobile app marketplace. Proc. 36th International Conference on Software Engineering.
- Software engineering for self-adaptive systems (Dagstuhl Seminar). Dagstuhl Seminar Proceedings.
- Supporting users after software deployment through selection-based crowdsourced contextual help.
- LemonAid: selection-based crowdsourced contextual help for web applications. Proc. SIGCHI Conference on Human Factors in Computing Systems.
Ke Mao is pursuing a PhD degree in computer science at University College London, under the supervision of Prof. Mark Harman and Dr. Licia Capra. He received the MSc degree in computer science from the Institute of Software, Chinese Academy of Sciences, China. He worked as a research intern and a software engineer intern at Microsoft and Baidu respectively. He has served as a publicity chair or a PC member for several international workshops on software crowdsourcing. He is currently investigating the application of crowdsourcing in software engineering, with a focus on crowdsourced software testing.
Licia Capra is Professor of pervasive computing in the Department of Computer Science at University College London. Licia conducts research in the area of computer-supported cooperative work. She has tackled specific topics within this broad research field, including crowdsourcing, coordination, context-awareness, trust management, and personalisation. She has published more than 70 papers on these topics, in top venues including SIGSOFT FSE, IEEE TSE, ACM CSCW, SIGIR, SIGKDD, and RecSys.
Mark Harman is Professor of software engineering at University College London, where he is the head of software systems engineering and director of the CREST centre. He is widely known for work on source code analysis and testing, and was instrumental in founding the field of search-based software engineering, a sub-field of software engineering which has now attracted over 1,600 authors spread over more than 40 countries.
Yue Jia is a lecturer in the Department of Computer Science at University College London. His research interests cover mutation testing, app store analysis and search-based software engineering. He has published more than 25 papers, one of which received the best paper award at SCAM’08. He co-authored several invited keynote papers at leading international conferences (ICST 2015, SPLC 2014, SEAMS 2014 and ASE 2012) and published a comprehensive survey on mutation testing in TSE. He has served on many programme committees, as program chair for the Mutation workshop, and as guest editor for the STVR special issue on Mutation.