Technical Debt Aware Estimations in Software Engineering: A Systematic Mapping Study

Context: The Technical Debt metaphor has grown in popularity. More software is being created and has to be maintained. Agile methodologies, in particular Scrum, are widely used by development teams around the world. Estimation is an often practised step in sprint planning. The subject matter of this paper is the impact technical debt has on estimations. Objective: The goal of this research is to identify estimation problems and their solutions due to previously introduced technical debt in software projects. Method: The systematic mapping study (SMS) method was applied in the research. Papers were selected from the popular digital databases (IEEE, ACM, Scopus, etc.) using defined search criteria. Afterwards, a snowballing procedure was performed and the final publication set was filtered using inclusion/exclusion criteria. Results: 42 studies were selected and evaluated. Five categories of problems and seven proposed solutions to the problems have been extracted from the papers. Problems include items related to the business perspective (delivery pressure or lack of technical debt understanding by business decision-makers) and the technical perspective (difficulties in forecasting architectural technical debt impact or limits of source code analysis). Solutions were categorized into: more sophisticated decision-making tools for business managers, better tools for estimation support and technical debt management tools on an architectural level, a portfolio approach to technical debt, code audit and a technical debt reduction routine conducted every sprint. Conclusion: The results of this mapping study can help in taking the appropriate approach to technical debt mitigation in organizations. However, the outcome of the conducted research shows that the problem of measuring technical debt impact on estimations has not yet been solved. We propose several directions for further investigation. In particular, we would focus on more sophisticated decision-making


Introduction
Today software is present in all industries worldwide. The Industry 4.0 [1,2] or Internet of Things [3] concepts are based on software to operate and provide solutions. Agile methods were proposed to better handle inevitable changes [4].
Cunningham [10] introduced the technical debt term to describe shortcuts taken by software engineers in order to deliver value on time. "A little debt speeds development so long as it is paid back promptly with a rewrite... The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt." [10]. The number of software developers increases every year. That implies more code being created and, as a result, more technical debt. According to Google Trends, the technical debt metaphor has been growing in popularity.
Figure 1. Technical Debt Landscape (inspired by [9])
Software project features may be delivered faster to users, but the effects of taking on technical debt (e.g., storing application data in a file instead of a database) will have to be addressed in the future. As stated by Fowler [11], technical debt can be taken on intentionally or unintentionally. Along with technical debt comes the concept of interest. Interest can be considered as "the extra maintenance cost spent for not achieving the ideal quality level." [12]. It is a metaphor for unpaid technical debt becoming more expensive to repay over time. Technical debt grows during the software development process, as stated in [13].
The Technical Debt Landscape (Figure 1) was introduced by Kruchten et al. [9]. The landscape identifies a mostly invisible area where potential problems affecting estimations exist. Mostly invisible items are hidden from everybody apart from software engineers. Other members of the project team are aware of them, but might not know the details. The authors state: "Technical debt should not be treated in isolation from adding new functionality or fixing defects" and "The challenge is in expressing all software development activities in terms of sequences of changes associated with a cost and a value" [9]. Software development teams should communicate the "technological gap" in effort estimation so that the mostly invisible parts are known to the managers and stakeholders.
Estimation is a process of rough calculation of how much time is needed to deliver business value related to the estimated task or feature. There are a number of techniques that help developers to provide more accurate estimations [14][15][16] (e.g., planning poker, smart use cases or the bucket system). Some of them use the developer's experience in a project to consider technical debt impact on estimation accuracy.
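The interest metaphor can be made concrete with a small calculation. The sketch below is only a toy model and is not taken from [12]: it assumes a hypothetical fixed repayment cost (the principal) and a hypothetical constant extra maintenance cost per sprint (the interest), and shows how the total effort grows the longer repayment is postponed.

```python
def cumulative_cost(principal_hours, interest_hours_per_sprint, sprints_until_repaid):
    """Total effort spent on one debt item that is repaid after a given number of sprints.

    Toy model (assumption): interest is a constant extra maintenance cost per sprint,
    and the principal (the cost of the fix itself) does not change over time.
    """
    if sprints_until_repaid < 0:
        raise ValueError("sprints_until_repaid must be non-negative")
    interest_paid = interest_hours_per_sprint * sprints_until_repaid
    return principal_hours + interest_paid


# Hypothetical numbers: a 16-hour fix that costs 2 extra hours in every sprint it is postponed.
repaid_now = cumulative_cost(16, 2, 0)
repaid_after_five_sprints = cumulative_cost(16, 2, 5)
```

Under these assumed numbers, immediate repayment costs only the principal, while each postponed sprint adds its interest to the total.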
Estimations are straightforward in well-specified projects. Development teams start from scratch and will introduce technical debt. As new features are implemented or as existing features are extended, the project's complexity increases. The problem with estimations becomes visible after the technical debt has been taken and has to be addressed. It may be expected that forecasting technical debt impact on a new or changed feature is more difficult in later development stages. Estimations are becoming inaccurate and one of the reasons is improper technical debt measurement. The problem has to be addressed.
The goal of this research was to conduct a systematic mapping study on technical debt in the context of estimations. A number of publications were collected, examined and categorized giving several directions for further research.
The paper is organized as follows: Section 2 presents related work. Section 3 defines research questions for this systematic mapping study (SMS). Research methodology and crucial details of the SMS protocol are described in Section 4. Section 5 shows study results with a detailed description. In Section 6 we interpret responses to the posed study research questions. Section 7 presents threats to validity, while in Section 8 we conclude the work and show directions of further research. A list of primary sources found in our SMS is presented before references.

Related work
The amount of software produced worldwide increases every year, which in turn affects technical debt. A number of studies have been conducted to address the problem of increasing technical debt from various perspectives.
Fernández-Sánchez et al. [17] searched for elements required in technical debt management. They came up with a list of 12 items that support decision making in managing technical debt. Items are divided into three types: (T1) Basic decision-making factors, (T2) Cost estimation techniques and (T3) Practices and techniques for decision-making. The result of this article is a framework introduced to aid decision making in technical debt management.
Another study by Fernández-Sánchez et al. [18] covers available techniques and methods for technical debt management from a software architecture perspective. In their systematic mapping study the authors discovered the impact of various technical debt types, such as code technical debt and documentation technical debt, on architectural technical debt. The conclusion is that further studies on architectural debt from a more holistic approach are needed. Ribeiro et al. [19] provide a list of 14 decision criteria on which technical debt repayment can be prioritized. The authors conclude that none of the researched studies has performed an empirical evaluation. In the authors' opinion, this may indicate a low level of maturity in the decision-making criteria themselves.
Li et al. [20], in their mapping study on technical debt and its management, identify a list of ten technical debt types and 29 tools used as technical debt management systems. They indicate, however, that only four tools are dedicated to technical debt management. The rest are adapted in various ways from other software development areas. They conclude that there is a need for more sophisticated and dedicated technical debt management tools and further research on technical debt management. More high-level studies should be conducted by the software engineering community.
In another related work, Behutiye et al. [21] analyse the concept of technical debt in Agile Software Development (ASD). A list of ten causes and five consequences of incurring technical debt in ASD was identified in the research. Authors also classified a list of technical debt management strategies in ASD. The research indicates the need for more tools, models and guidelines that support management of technical debt in ASD [21] and the role of architecture in ASD.
The financial aspect is considered by Ampatzoglou et al. [22]. The authors introduced a glossary of financial terminology and a classification schema of financial approaches used in technical debt management. The publication also states that financial terminology makes it easier for developers to communicate with non-technical managers.
A systematic mapping study on the identification and management of technical debt was conducted in [23]. The research enumerates strategies that have been proposed to identify or manage technical debt in software projects. The conclusion is that most of the strategies are new, but they lack studies to evaluate their real applicability.
None of the mentioned publications addressed the problem of technical debt impact on estimations. The goal of our work differs from the other secondary studies in terms of the research perspective and scope. Our study focuses on understanding how task delivery estimation is affected by technical debt and what software development teams do to develop software according to plan.

Research objectives
The objectives of this study were to identify problems in estimations due to existing technical debt in software projects and to collect ideas on how development teams try to overcome these problems. The following research questions were stated:
RQ1: What are the problems for the development team during task estimation due to technical debt?
The purpose of this question is to confirm problem existence. Potentially it could be possible to identify groups of similar problems.

RQ2: What kind of solutions are proposed to mitigate the impact of technical debt on task estimation?
The purpose of this question is to collect the actions taken by development teams to reduce the technical debt factor in estimations.

Research methodology
In software engineering, guidelines developed by Kitchenham et al. [24] and Petersen et al. [25] provide comprehensive instructions on how to conduct systematic literature reviews (SLR) and systematic mapping studies. They share some commonalities (e.g., related to searching and study selection). However, the difference between both approaches is that systematic literature reviews focus on synthesising the evidence and gaining new knowledge, while systematic mapping studies [25] are focused on structuring the research area and creating an overview. A systematic mapping study was chosen as the framework for this research to answer the questions posed in Section 3.

Systematic Mapping Study (SMS) protocol
Our protocol defines the procedures we intended to use for the SMS, including the following steps:
1. Define study objectives and research questions
2. Define search query and digital source databases
3. Define publication selection criteria
4. Define inclusion and exclusion criteria
5. Conduct data extraction and assessment
6. Conduct data synthesis
After trialling the specified processes, the final version of the protocol was agreed by both authors. The following sections are based on the processes defined in the protocol. However, it is worth mentioning that we have added an additional exclusion criterion (short summary reports) that was not mentioned in the protocol.

Search query
We performed a series of trial queries against electronic databases. As a result, the following search query was formulated:
("software") AND ("technical debt" OR "change impact") AND ("estimation" OR "decision making" OR "management")
Such a search query finds publications with a technical debt aspect in various contexts.
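The boolean structure of the query can be illustrated as a simple predicate over a publication's title or abstract. This is only a sketch under stated assumptions: the function name `matches_query` is hypothetical, matching is done by case-insensitive substring search, and real digital libraries apply their own field selection and matching rules, so the result sets will differ.

```python
TOPIC_TERMS = ("technical debt", "change impact")
ACTIVITY_TERMS = ("estimation", "decision making", "management")


def matches_query(text: str) -> bool:
    """Return True if the text satisfies the three AND-ed groups of the search query.

    Assumption: plain case-insensitive substring matching stands in for the
    database search engines, which is a simplification.
    """
    lowered = text.lower()
    has_software = "software" in lowered
    has_topic = any(term in lowered for term in TOPIC_TERMS)
    has_activity = any(term in lowered for term in ACTIVITY_TERMS)
    return has_software and has_topic and has_activity


# Hypothetical titles, used only to exercise the predicate.
candidate = matches_query("Technical debt management in software projects")
non_candidate = matches_query("Software estimation techniques")
```

The second hypothetical title is rejected because it contains neither "technical debt" nor "change impact", even though it matches the other two groups.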

Digital source databases
Publication sources include all popular academic databases. The year 1992 was chosen as the timeframe limit since Cunningham published his paper at that time [10]. Studies from the following digital source databases were included:
-IEEE Xplore [

Inclusion/exclusion criteria
The search query defined in Section 4.2 returned a total of 2003 candidate documents for the primary studies set. The distribution of documents per source database is presented in Table 1.
The primary studies set contained many irrelevant publications, due to the generic nature of the search query. Thus, the following inclusion/exclusion criteria were applied to select only relevant studies.

Snowballing
The importance of the snowballing step in SMS is described in [31]. Backward snowballing was performed for this study. Papers found in snowballing were checked using the same inclusion/exclusion criteria list as primary papers. The snowballing technique found one additional publication.
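A single pass of backward snowballing can be sketched as follows. The function and parameter names are hypothetical, the reference lists are assumed to be available as a plain mapping, and `is_relevant` stands in for the manual application of the inclusion/exclusion criteria; the sketch also assumes one iteration over the already included papers only.

```python
from typing import Callable, Dict, Iterable, List, Set


def backward_snowball(
    included: Iterable[str],
    references: Dict[str, List[str]],
    is_relevant: Callable[[str], bool],
) -> Set[str]:
    """Return papers cited by the included set that pass the selection criteria.

    Assumption: one backward pass; papers found here are not themselves
    expanded further.
    """
    start = set(included)
    found: Set[str] = set()
    for paper in start:
        for cited in references.get(paper, []):
            # Skip papers that are already in the set; check the rest against the criteria.
            if cited not in start and is_relevant(cited):
                found.add(cited)
    return found


# Hypothetical identifiers: "A" is included and cites "B" and "C"; only "C" and "D" are relevant.
refs = {"A": ["B", "C"], "B": ["D"]}
new_papers = backward_snowball(["A"], refs, lambda p: p in {"C", "D"})
```

In this hypothetical example only "C" is returned: "D" is relevant but is cited by "B", which is not part of the included set.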

Data extraction and assessment
The data extraction and assessment process focused on collecting evidence that can formulate an answer to the RQs. All filtered publications were read in full. Microsoft Excel was used to record and organize the following data: title, source, citation eligible for RQ1 or RQ2 and publication type. The assessment was based on whether a study provides evidence to answer one of the RQs.

Initial research set
The initial research set consisted of 45 articles. After applying the inclusion/exclusion criteria, papers [S1], [S2] and [S3] were excluded. Decisions were discussed by both authors.

Rigor and relevance
We applied a checklist proposed by Ivarsson and Gorschek [32] to assess the rigor and relevance of the final dataset. The rating model consists of two perspectives to measure: rigor and relevance. Rigor refers to how an evaluation is performed and how it is reported. Relevance measures the industrial applicability in terms of the usage context, the research method used, subjects/users and scalability. Each item is scored 0, 0.5 or 1 in the rigor perspective and 0 or 1 in the relevance perspective.
The first author rated the studies for quality assurance. The rigor and relevance scores distribution in our SMS is presented in Figure 2.
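The scoring scheme can be sketched as two small functions that validate the allowed values and sum the per-item scores. The relevance item names follow the description above; the rigor item names (context, study design, validity) are an assumption based on the checklist in [32] and are not spelled out in this text, and the example ratings are hypothetical.

```python
RIGOR_VALUES = (0.0, 0.5, 1.0)   # allowed scores per rigor item
RELEVANCE_VALUES = (0, 1)        # allowed scores per relevance item


def rigor_score(items: dict) -> float:
    """Sum rigor item scores after checking that each is 0, 0.5 or 1."""
    for name, value in items.items():
        if value not in RIGOR_VALUES:
            raise ValueError(f"rigor item {name!r} must be 0, 0.5 or 1")
    return float(sum(items.values()))


def relevance_score(items: dict) -> int:
    """Sum relevance item scores after checking that each is 0 or 1."""
    for name, value in items.items():
        if value not in RELEVANCE_VALUES:
            raise ValueError(f"relevance item {name!r} must be 0 or 1")
    return int(sum(items.values()))


# Hypothetical ratings for a single study; item names are illustrative.
rigor = rigor_score({"context": 1, "study_design": 0.5, "validity": 0})
relevance = relevance_score({"subjects": 0, "context": 1, "scale": 1, "research_method": 1})
```

With three rigor items and four relevance items, a study's rigor score ranges from 0 to 3 and its relevance score from 0 to 4.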
In order to review the selection agreement among the authors, a Kappa analysis [33] was performed. Seven randomly selected publications were examined by the second author. Based on the selected sample Kappa value was calcu-
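A minimal sketch of how such an agreement statistic can be computed is given below. It assumes Cohen's kappa for two raters with categorical decisions, which may differ in detail from the analysis in [33]; the include/exclude decisions in the example are hypothetical and are not the data of this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical decisions of two raters on seven publications.
first_author = ["in", "in", "in", "in", "out", "out", "out"]
second_author = ["in", "in", "in", "out", "out", "out", "out"]
kappa = cohens_kappa(first_author, second_author)
```

In this hypothetical sample the raters agree on six of seven publications; kappa is lower than the raw agreement because part of that agreement is expected by chance.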

Final set of papers
We selected 42 publications; see Table 2 and the list of primary studies found in our systematic mapping, presented before references. 41 of the papers were filtered through digital source databases using the search query presented in Section 4.2. An additional one was found during the snowballing process. Table 1 presents the distribution of publications per digital source database and snowballing procedure. It is worth mentioning that case studies were the most popular publication type among the accepted primary studies.
At this point we assessed all evidence for eligibility and divided it into two groups: Identified problem categories (G1, addressing RQ1) and Identified potential solution categories (G2, addressing RQ2). The groups would later provide potential answers to the respective RQs. The next step was to synthesise the data.

Data synthesis
The purpose of data processing is to synthesize extracted data in order to answer RQs from Section 3. Data extracted in Section 4.5 was divided into two groups. Each group contains a number of categories that emerged from examined publications. Category names were deduced from clustering items in each group.
Each category has its description and several papers addressing a particular subject. Results of data synthesis are available in Table 2.

Problems in estimation due to technical debt (RQ1)
We gathered five categories of problems in user-story estimation due to technical debt:
- Business pressure on delivery: 11 papers (i.e., 44% of publications that identified problems) emphasised that business pressure was the key factor in estimations and therefore in technical debt introduction. Hence, we think that this problem is widespread. In one of the studies, the authors say: "The participants commonly acknowledged that technical debt is essentially a balance between software quality and business reality" [S6]. The authors list a number of reasons behind that statement: (1) being contractually obligated to deliver the system under a tight deadline, (2) meeting deadlines to integrate with a partner product before release,  [S22].
- Architectural-level technical debt visualisation tool: Seven publications indicate the need for a high-level technical debt monitoring tool, that is, a tool that has knowledge about technical debt not only in separate system components but also between them and in the system as a whole. The authors of one study stated: "Making the architectural debt visible provides the necessary information for making informed decisions for managing the potential impact of rework over time" [S21]. This issue is also mentioned by others: "The lack of tool support for accurately managing and tracking architectural sources of debt is a key issue..." [S15].
- Technical debt reduction in every sprint: Eight publications propose continuous technical debt reduction during every sprint. A related excerpt in one of the papers is as follows: "one participant described a policy of allocating 5 to 10 per cent of each release cycle to addressing technical debt" [S6].
- Code audit activity: Three papers ([S6], [S19], [S12]) propose periodic and systematic code audit actions conducted by the development team. The authors of one of the studies conclude: "...conduct audits with the entire development team to make technical debt visible and explicit; track it using a Wiki, backlog, or task board" [S6].
- Extra resources: Two papers propose adding extra resources such as people [S22], infrastructure or budget [S45] to the project. Such solutions may indicate a tight project schedule or an attempt to reach the project deadline.
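The policy quoted from [S6], allocating 5 to 10 per cent of each release cycle to technical debt, can be turned into a simple capacity split. The sketch below is illustrative only: the function name, the use of hours as the unit and the example capacity are assumptions, not details reported in the primary studies.

```python
def split_capacity(capacity_hours: float, debt_share: float = 0.10):
    """Split a cycle's capacity into (feature_hours, debt_hours) for a given debt share.

    Assumption: capacity is expressed in hours and debt_share is a fraction,
    e.g. 0.05 to 0.10 for the 5 to 10 per cent policy.
    """
    if not 0.0 <= debt_share <= 1.0:
        raise ValueError("debt_share must be between 0 and 1")
    debt_hours = capacity_hours * debt_share
    return capacity_hours - debt_hours, debt_hours


# Hypothetical team capacity of 200 hours per cycle, at both ends of the quoted range.
features_low, debt_low = split_capacity(200, 0.05)
features_high, debt_high = split_capacity(200, 0.10)
```

Reserving the debt share before features are estimated keeps the remaining capacity, rather than the full capacity, as the basis for sprint commitments.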

Discussion
The overall goal of this research was to identify problems, as well as proposed solutions occurring in estimations due to previously introduced technical debt. In this section we will present our interpretation of systematic mapping study results and their implications for academia and industry.

Problems in estimation due to technical debt
Business pressure on delivery and lack of technical debt awareness in management are related to the business perspective in a particular software project. The main purpose of building software is to support other processes. Managers and business officers are focused on growing the organization. Software support can give them a competitive advantage, and that is why they push for short software release cycles.
The lack of procedures for technical debt management, mentioned in three research papers, indicates an immature development process. This may be due to various reasons. Company owners may not be aware of the technical debt problem or may consider a particular project as a prototype where technical debt is not considered a problem. On the other hand, the project can be so big that introducing new development procedures is too cumbersome or too expensive. Finally, the development team may not know how to introduce such procedures.
Results such as the impact of architectural technical debt and the insufficiency of source code analysis can be interpreted differently. Those problems are more related to technical aspects. The architectural technical debt impact item is strongly bound to project evolution. For instance, the mainstream in web development is moving to cloud-based solutions and application containers providing better scalability and flexibility. Adjusting old software can be difficult and can be considered an example of architectural technical debt. Source code analysis is also not sufficient because engineers may adjust their code in such a way that it passes the code analysis but remains of poor quality.
Depending on the perspective from which we consider the situation, different problems are present. In the worst-case scenario, all of them can occur in the organization and will slow down the development process even further.

Solutions to mitigate technical debt in estimations
Only one proposed solution focuses on non-technical stakeholders (tools for decision support). However, 38% of examined studies (14 of
Another interesting interpretation arises from the portfolio approach (technical debt items), technical debt reduction in every sprint and code audit activity. All of those solutions can be interpreted as a need for deeper standardization and/or regulation of software development processes. In other industries, such as medicine, maritime, aviation or automotive, rules and regulations according to which certain procedures have to be conducted do exist. In IT there is the ISO/IEC 25010 standard, but it is not mandatory to implement it.
The findings indicate that "Time To Market" has the biggest impact on the schedule and on the decision whether or not to repay the technical debt. The software solutions are too complicated and cannot be adapted fast enough in a rapidly changing world. An interesting fact the study uncovers is that source code analysis tools are not sufficient to cope with technical debt in estimations.
Based on the information from the performed SMS, we recommend focusing future research on various decision-support levels. The complexity of software solutions grows and it is more difficult to get an overview from both business and technical perspectives. We propose that such decision-support research should take into consideration software maintenance and evolution.

Threats to validity
A systematic mapping study is conducted by people, and thus an inevitable risk is the bias that may come from the choice of search engines/digital libraries and of search terms, which may favour finding some studies and missing others. Hence, an important threat to the validity of this SMS is related to the search strategy employed and the possibility that we have not identified all relevant papers. The completeness of the search depends on the search string used and the scope of the search in terms of selected search engines, as well as their limitations (Brereton et al. [34]). For example, it is possible to extend the search query even further by adding additional words like "managing". We do not think this is a significant threat. Nevertheless, it is still possible that after such an extension the resulting set of papers would be different, but (in our opinion) only to a minor extent. To reduce this threat we selected a range of digital libraries and thus widened the scope of the search. We also used a known set of references to validate the search terms before undertaking the mapping study, and the search terms were amended where necessary (e.g., we included "change impact", which we initially missed).
The time window chosen by us (from 1992 onwards) can be seen as a threat. That said, we think that the knowledge about technical debt, software development and programming languages has evolved to such an extent that we probably do not lose anything crucial by excluding papers published before 1992.
We also conducted snowballing to limit the possibility of missing relevant papers. Only one additional paper was identified by searching the references of included studies.
A closely related threat is that "grey literature" may not be found due to the nature of digital libraries used. Snowballing can be seen as a partial solution to limit this threat as references of the papers found in digital libraries may include "grey literature" as well.
It is also worth mentioning that categories synthesised from publications data extraction emerged from our best understanding of the topic. We proposed category names presented in Table 2 based on our experience in software engineering.
We limited the scope of our search to articles written in English. Thus the presented results can be biased by omitting publications written in other languages (e.g., Chinese). However, we based our research on the most popular language among software engineering researchers and practitioners.

Table 3. Evaluation of our mapping process (see [25])
Rubric | Score | Description
Need for review | 1 | Partial evaluation: motivations and questions are provided.
Choosing the search strategy | 1 | Minimal evaluation: two search strategies (automated database search and snowballing) have been used.
Evaluation of the search | 2 | Partial evaluation: at least one action has been taken to improve the reliability of the search and the inclusion/exclusion.
Extraction and classification | 2 | Partial evaluation: at least one action has been taken to increase the reliability of the extraction process, and research type and method have been classified.
Study validity | 1 | Full evaluation: threats and limitations are described.

A search-related limitation of this mapping study is that the search only covers publications that were included in the chosen digital libraries before January 2019. This date is related to the moment when the mapping study was performed. It is therefore probable (due to the fact that technical debt is perceived as an interesting topic) that a number of other relevant papers will have been published since this date that we have not included in this mapping study. However, this limitation is difficult to avoid and the common solution is to conduct a new search and/or snowballing to update the results of the mapping study.
Additionally, Table 3 presents an evaluation of our mapping process on the basis of the quality checklist rubric criteria (defined by Petersen et al. [25]), including: identifying the need for the SMS, study identification, data extraction and classification, as well as validity discussion.

Conclusions
In this systematic mapping study, 42 relevant publications were selected out of 2003 candidates: 41 from the query search in five digital databases and one additional publication from the snowballing procedure. The contribution of this study is a categorisation of technical debt related issues in task estimations and proposed solutions to the issues presented in Section 5. Five problems and seven solutions identified in the literature have been categorised. Furthermore, the majority of identified categories of problems and solutions include real-life examples describing industry cases.
The technical debt impact on task estimation is an important issue to address. Our SMS shows seven approaches to extend the current state of technical debt management. We conclude that the task estimation accuracy can be further improved in one of the following directions:
- business direction: research on how managers can gain more insight into the software system that is supporting their business, so that they understand the system's current limitations and the impact of new business decisions on it. That implies research on how software engineers can improve communication with "the business."
- operational direction: research on software systems maintainability and development routines. That includes new ways of formalizing and structuring software components, data flows, integrations and others so that it would be easy to analyse the impact of new requirements on the software project.
The problem of business pressure on feature delivery has appeared in our findings on several occasions. Our further research will focus on decision-making tools. In our opinion, there is room for improvement that will potentially help development teams to measure the impact of technical debt on estimations with greater precision.