A Systematic Methodology to Migrate Complex Real-Time Software Systems to Multi-Core Platforms

This paper proposes a systematic three-stage methodology for migrating complex real-time industrial software systems from single-core to multi-core computing platforms. Single-core platforms have limited computational capabilities that prevent integration of computationally demanding applications such as image processing within the existing system. Modern multi-core processors offer a promising solution to address these limitations by providing increased computational power and allowing parallel execution of different applications within the system. However, the transition from traditional single-core to contemporary multi-core computing platforms is non-trivial and requires a systematic and well-defined migration process. This paper reviews some of the existing migration methods and provides a systematic multi-phase migration process with emphasis on software architecture recovery and transformation to explicitly address the timing and dependability attributes expected of industrial software systems. The methodology was evaluated using a survey-based approach and the results indicate that the presented methodology is feasible, usable and useful for real-time industrial software systems.


Introduction
Software evolution has been a continuous process in industrial real-time embedded software systems with new functionality, performance improvements and bug fixes introduced with each new version, revision or release [1,2]. Many of the industrial systems have been developed over the decades [3], undergoing major revisions due to technology shifts, changing customer requirements, improved development processes, among others. One constant factor associated with the evolution of such systems is that the software architectures and the implementations have focused on single-core computing platforms. Integrating new data-intensive and computationally demanding applications within the system, however, requires additional computational capacity. Moreover, with the decreasing availability of the single-core processors, migrating the existing software to multi-core computing platforms is becoming a dependability requirements, we used a focus group dismode" [11]. The different modes and the transition be-  the timing predictability of the overall software system.

Migration Methodology
Based on the reviewed methods and the extra-functional requirements, we create a migration workflow as depicted in Fig. 2.
1. During the first stage, we focus on the migration of software architecture. In this stage, the goal is to synthesize an abstract system model, validate its accuracy and transform the model for the multi-core environment.
2. In the second stage, the implementation and verification migration, the goal is to analyse the system source code to identify potential concurrency issues within the code and transform the code according to the new multi-core architecture model. Additionally, the existing verification techniques are augmented with methods relevant for a multi-core architecture.
3. In the third stage, we validate the migration process by identifying the validation parameters, measuring these parameters and then comparing them with the values obtained before migration.

Architecture Abstraction and Representation
In this phase, we seek to identify an abstraction level that can accurately represent the system behaviour. An abstraction level close to the implementation may be too detailed, while a higher abstraction level can miss
the possible modelling languages and frameworks that can be used to represent the system under discussion.
the different components of the system and the inter-
for analysis becomes non-trivial for such cases. Further, since the controller software operates under different modes, the "maximum load" approach could be pessimistic. Depending on the system under migration, we will need to identify an appropriate configuration and analyse the run-time behaviour of each mode independently. For the controller software considered, the "normal operation mode" had the highest resource demand and, since all the other modes run only a subset of the "normal operation mode" tasks, we use the maximum load configuration of the "normal operation mode" and ensure that all the required system software components are active during the trace period. Note that we rely on the latest released version of the software.
During the run-time analysis of our system, we found that there were inconsistencies between the expected and observed behaviours. A few of the inconsistencies were a result of incorrect configuration of the instrumented code, while others were actual deviations from the expected behaviour. For example, the incorrect configuration resulted in the trace logs showing multiple instances of the jobs of a task as a single job of the same task. This observation highlights the fact that relying on a single source for information is not only ineffective but also error-prone. It calls for expert validation of the collected information to create a sufficiently accurate system model.
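The cross-check between expected and observed behaviour described above can be illustrated with a small script. This is a minimal sketch, not the tooling used in the study: it assumes a hypothetical trace format in which every job release is recorded as a `(task, timestamp_ms)` pair, and the task names, the expected periods and the 10% tolerance are illustrative assumptions.

```python
from collections import defaultdict


def observed_gaps(trace):
    """Return, per task, the gaps between consecutive job releases.

    `trace` is an iterable of (task_name, release_time_ms) pairs; this
    format is an assumption made for the sketch.
    """
    releases = defaultdict(list)
    for task, timestamp in trace:
        releases[task].append(timestamp)
    gaps = {}
    for task, times in releases.items():
        times.sort()
        gaps[task] = [later - earlier for earlier, later in zip(times, times[1:])]
    return gaps


def flag_inconsistencies(trace, expected_period_ms, tolerance=0.1):
    """Return {task: [gap, ...]} for gaps that deviate from the expected
    period by more than `tolerance` (a fraction of the period).

    A flagged gap is only a candidate for expert review: it may be a
    tracing artefact (for example several jobs logged as one) or an
    intentional design decision.
    """
    flagged = {}
    for task, task_gaps in observed_gaps(trace).items():
        period = expected_period_ms.get(task)
        if period is None:
            continue  # no documented expectation for this task
        deviating = [g for g in task_gaps if abs(g - period) > tolerance * period]
        if deviating:
            flagged[task] = deviating
    return flagged


if __name__ == "__main__":
    # Hypothetical trace: task_a misses the release expected at t = 20 ms.
    trace = [("task_a", 0), ("task_a", 10), ("task_a", 30), ("task_a", 40),
             ("task_b", 0), ("task_b", 5), ("task_b", 10)]
    print(flag_inconsistencies(trace, {"task_a": 10, "task_b": 5}))
```

A gap close to an integer multiple of the expected period is consistent with jobs being merged or dropped in the log, but the script cannot decide this on its own. Its output is meant as input to the expert discussions rather than as a verdict.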
Expert Validation. Architectural design decisions are made by analysing multiple factors such as domain requirements, dependencies on services provided by the operating systems and the underlying hardware platform, among others. However, the high-level architectural models and documents do not describe the rationale behind the design decisions and even if they do, such information is limited. Moreover, in legacy systems, such documents do not completely reflect the implementation [36]. Furthermore, as the information from the run-time analysis is quantitative and statistical in nature, it is possible to misinterpret any deviation from a commonly occurring pattern as an inconsistency whereas this could have been a design decision.
To avoid such misinterpretations and improve system model accuracy, discussions with domain experts are mandatory during the architecture analysis. These discussions will be used to understand the rationale behind the design decisions, and to validate the observations of the documentation and the run-time analysis phases.
In our work, we were able to validate the inconsistencies such as the deviation from a commonly occurring pattern as a design decision and also mark some of the observed results as an outcome of incorrect code instrumentation configuration. For example, due to incor-
Although we don't make any specific recommendations, we would like to point out that the number of potential solutions could be infinitely many and we hypothesize that evaluating each solution will be impossible, especially in the case of real-time systems, where the search space in terms of near-optimal solutions is large [8, 9, 38, 39]. Therefore, the domain experts are a good starting point in this stage. Also, the information from the architecture abstraction and recovery phases can be a useful guide in reducing the search space. In our case, we use expert interviews and review the state of the art in the real-time systems domain to identify potential solutions. Another important consideration is that since application developers are focused primarily on the application functionality, they rely on the operating systems to provide support for real-time properties. This implies that in many cases, only those mechanisms supported by an operating system can be considered as part of the potential solution set.
As highlighted earlier, the purpose of an abstract system model is to capture all the relevant properties of the system but without the functional complexity. This enables creation of synthetic tasks for simulation and verification of new design solutions. These abstract task sets can be modified and verified in short time spans when compared to modification of the actual implementation of the system. Many of the real-time workload models such as those reviewed in [21] have been successfully used to represent practical systems such as in the avionics domain as well as in the automotive domain. While many of these workload models consider the tasks to be independent, we found that the system under study violates this assumption and that new jobs of tasks are triggered by jobs of other tasks. Also, the presence of event-triggered components within the system along with multi-rate task chains implementing a single functionality requires that the precedence constraints as well as task chains be taken into account when evaluating potential solutions [30].
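To make the idea of an abstract task set with trigger dependencies concrete, the following is a minimal sketch and not the model used in the study. The `SyntheticTask` fields, the task names and the simplifying assumption that a triggered task releases exactly one job per job of its trigger are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Optional


@dataclass(frozen=True)
class SyntheticTask:
    """Timing-only abstraction of a task (no functional behaviour)."""
    name: str
    wcet_ms: float                       # worst-case execution time
    period_ms: Optional[float] = None    # None for event-triggered tasks
    triggered_by: Optional[str] = None   # name of the triggering task, if any


def effective_period(task, by_name):
    """Walk up the trigger chain until a periodic task is found.

    Assumption: each triggered job is released once per job of its trigger.
    """
    seen = set()
    while task.period_ms is None:
        if task.triggered_by is None or task.name in seen:
            raise ValueError(f"cannot derive a period for {task.name}")
        seen.add(task.name)
        task = by_name[task.triggered_by]
    return task.period_ms


def utilization(tasks):
    """Total processor demand of the task set (sum of WCET / period)."""
    by_name = {t.name: t for t in tasks}
    return sum(t.wcet_ms / effective_period(t, by_name) for t in tasks)


def precedence_order(tasks):
    """Order tasks so that every task appears after the task that triggers it."""
    graph = {t.name: ({t.triggered_by} if t.triggered_by else set()) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    # Hypothetical three-task chain: sensor -> filter -> control.
    chain = [
        SyntheticTask("control", wcet_ms=3.0, triggered_by="filter"),
        SyntheticTask("sensor", wcet_ms=1.0, period_ms=10.0),
        SyntheticTask("filter", wcet_ms=2.0, triggered_by="sensor"),
    ]
    print(precedence_order(chain))
    print(round(utilization(chain), 3))
```

Keeping the trigger relation explicit in the model is what allows a candidate core allocation to be checked against precedence constraints. A model of independent periodic tasks would lose that information.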
Some of the relevant issues that should be addressed by the potential solutions for transitioning from single-core to multi-core platforms were highlighted by
can remove any ambiguity associated with the perceived

Implementation
Once all components have been identified for modification and new components created, the necessary changes are implemented in the source code. Although the concurrency-related issues are addressed during the architecture transformation phase, it is possible that they could manifest during the implementation stage. Therefore, coding guidelines that address these issues are provided to the developers to minimise the manifestation of these issues during the implementation.

Verification Migration
The system verification and validation stage is the final stage of the migration process. Typically, for a system such as the one being considered, a reliable verification process is already in place. This includes the usual verification approaches such as unit testing, functional testing, and system integration tests. Since the architectural transformation is primarily related to the runtime behaviour and performance, we expect most, if not all, existing tests related to functional behaviour to remain valid. Therefore, we hypothesise that any failures here could be related to the concurrent execution of the system tasks. To maintain the quality of the system software, we focus on augmenting the existing tests with concurrency-related testing approaches along with performance verification. Again, to approach this enhancement in a systematic way, we divide the verification migration process into concurrency testing and the migration validation phase.

Concurrency Testing
The goal during this phase is to augment the existing verification process to identify concurrency-related issues. These include race conditions, atomicity violations and deadlocks. A comprehensive review can be found in the work by Bianchi et al. [46]. We propose the analysis of solutions during the architecture transformation phase to identify scenarios that could lead to potential concurrency issues. This way, it will be possible to create tests for those specific scenarios. Additionally, static code analysis that identifies concurrency bugs is added to enhance the verification process.
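A scenario-specific concurrency test of the kind proposed above could look as follows. This is an illustrative sketch only: Python is used for brevity rather than the implementation language of the system, and `MessageCounter`, the thread count and the iteration count are hypothetical. The scenario is several tasks updating a shared counter in parallel, where an unprotected read-modify-write could lose updates.

```python
import threading


class MessageCounter:
    """Shared counter whose read-modify-write is protected by a lock."""

    def __init__(self):
        self._count = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same value and
        # one of the updates would be lost (an atomicity violation).
        with self._lock:
            self._count += 1

    @property
    def count(self):
        with self._lock:
            return self._count


def stress(counter, n_threads=8, n_increments=10_000):
    """Hammer the counter from several threads and return the final value."""
    start = threading.Barrier(n_threads)  # release all workers together

    def worker():
        start.wait()
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.count


if __name__ == "__main__":
    # Every increment must be accounted for when access is synchronised.
    assert stress(MessageCounter()) == 8 * 10_000
    print("no lost updates observed")
```

A passing stress test does not prove the absence of races, since the interleavings that occur depend on the scheduler. This is one reason the text pairs targeted tests with static code analysis.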

Migration Validation
During this phase, we focus on validation of the migration process itself. We begin by identifying the parameters to qualitatively validate the outcome of the process. We use two metrics for this purpose: (i) results of the functional and system integration tests, and (ii) performance-related parameters such as response times.
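The comparison against pre-migration values using these two metrics can be sketched as a small check that runs on each build. This is a hedged illustration rather than the validation procedure of the study: the dictionary-based input format, the test and task names, and the rule that a larger response time counts as a regression are assumptions made for the example.

```python
def validate_migration(tests_before, tests_after, resp_before_ms, resp_after_ms):
    """Compare post-migration results with the pre-migration baseline.

    tests_*   : {test_name: passed (bool)}
    resp_*_ms : {task_name: measured response time in ms}
    Assumption: a response time larger than the baseline is a regression.
    """
    # Metric (i): tests that passed before but fail (or are missing) now.
    new_failures = sorted(
        name for name, passed in tests_before.items()
        if passed and not tests_after.get(name, False)
    )
    # Metric (ii): tasks whose response time got worse.
    regressions = {
        task: (resp_before_ms[task], after)
        for task, after in resp_after_ms.items()
        if task in resp_before_ms and after > resp_before_ms[task]
    }
    return {
        "new_failures": new_failures,
        "regressions": regressions,
        "ok": not new_failures and not regressions,
    }


if __name__ == "__main__":
    # Hypothetical data for one build.
    report = validate_migration(
        tests_before={"t1": True, "t2": True, "t3": False},
        tests_after={"t1": True, "t2": False, "t3": False},
        resp_before_ms={"task_a": 2.0, "task_b": 5.0},
        resp_after_ms={"task_a": 1.5, "task_b": 6.0},
    )
    print(report)
```

Tests that already failed before the migration are deliberately not counted, so the report isolates failures introduced by the migration itself.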

In the first case, no new failures should be introduced

The respondents were requested to read about the presented methodology before they answered the survey. The received responses were then analysed to evaluate the methodology.
rather big, and therefore we need to address the migra-
on the application, the abstraction level and the modelling requirements will depend on individual applications. From the usefulness perspective, the responses show that following the methodology steps can decrease the risks associated with the migration. From the generalisation perspective, the responses show that the observations made in the methodology can be extended to systems other than the robotic system considered, while highlighting the fact that it may not always be possible to describe the timing properties for all of the applica-

Figure 1: Main Modes in the System

and apply the Analyze, Verify, Transform and Validate approach to this workflow. Essentially, during analysis, the requirements for the migration process are established and the existing system behaviour is recovered. Then the results of the analysis are verified by the subject experts. New solutions are identified and evaluated during the transformation phase. Finally, the applicability of these solutions, along with the migration process, is validated during the validation phase. Additionally, we consider the migration process to be iterative in the sense that each stage can be revisited and decisions can be rolled back or modified to address issues that may have been missed or if they do not meet the objective of the migration. A brief overview of the different stages of the proposed workflow is as follows:

Figure 3: Various phases in the software architecture migration.

5.1. Architecture Requirements Specification
The architecture requirements specification is the first phase of the architecture migration process. The requirements are essentially high-level and the extra-functional requirements of scalability, performance and timing guarantees are the guiding principles for the complete migration process. The more concrete requirements are defined during the architecture recovery phase of the migration process. We also include the identification of a requirements specification and management process in this phase to better manage the requirements for the rest of the migration process.

Fig. 5. A software component communicates with other

Figure 5: Properties of a Software Component.

rect configuration of the code instrumentation library, the periodicity of the TS RPI observed during run-time analysis phase did not match the values expected by the experts. The functional behaviour, however, was accurate, prompting a separate analysis. This analysis identified incorrect configuration of the code instrumentation as the root cause for observed deviation in the periodic-
As discussed earlier, the architecture transformation phase focuses primarily on evaluating potential solutions and identifying the most appropriate ones for the final implementation. Before we evaluate any solution, we need to identify the system requirements that need to be considered to identify, evaluate and qualitatively rank possible solutions. Since in our case, the migration to multi-core will primarily affect the runtime behaviour, we focus on the explicit temporal requirements, implicit requirements such as the number of messages in a queue and assigned QoS levels to different functional domains. An important requirement here is to ensure that this transformation results in improved system predictability, performance and that the architecture is scalable in terms of the number of cores and new functionality that needs to be integrated into future versions of the software. Since the terms predictability, performance, and scalability are generic in nature, we need to ensure that we have measurable definitions for these terms. For example, we use scalability to refer to the capability of the controller software to control more than one manipulator on the same hardware platform. Once we define the evaluation criteria, we then move towards the evaluation process itself. The evaluation can be carried out in various ways depending on the evaluation metric and the solution being considered, such as simulation, model-checking and analytical calculations. Once the evaluation of possible solutions is complete, we rank these solutions based on an agreed evaluation metric and based on these rankings, we select the solutions for the final implementation phase. To ensure that this transformation is systematic, we divide the transformation phase into the following steps:
Identification of potential solutions. Identification of potential solutions can be done in many different ways.
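The evaluate-then-rank step outlined in the overview of the transformation phase can be sketched as a weighted scoring of candidate solutions. The text only states that solutions are ranked by an agreed evaluation metric, so the weighted sum used here, the criteria scores in the range 0 to 1, the weights and the solution names are all hypothetical assumptions for illustration.

```python
def rank_solutions(scores, weights):
    """Rank candidate solutions by a weighted sum of their criteria scores.

    scores  : {solution_name: {criterion: score in [0, 1]}}
    weights : {criterion: weight}, agreed on before the evaluation
    Returns a list of (solution_name, total_score), best first.
    """
    def total(metrics):
        return sum(weight * metrics[criterion] for criterion, weight in weights.items())

    ranked = [(name, total(metrics)) for name, metrics in scores.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical scores obtained from simulation or analysis.
    scores = {
        "solution_a": {"predictability": 0.9, "performance": 0.6, "scalability": 0.7},
        "solution_b": {"predictability": 0.5, "performance": 0.9, "scalability": 0.8},
    }
    weights = {"predictability": 0.5, "performance": 0.3, "scalability": 0.2}
    for name, score in rank_solutions(scores, weights):
        print(f"{name}: {score:.2f}")
```

Fixing the weights before the scores are known keeps the ranking from being tuned towards a preferred solution after the fact.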

after the migration. In the second, the values of the performance parameters should not be less than those measured with the pre-migration version. We point out here that although the validation is the last step, depending on the development process, this validation can be applied to each build prior to release. By using the results of the validation with each build, the pace of the migration process can be measured.

8. Tools for Migration
Software migration from single-core to multi-core architectures is a complex process and requires the use of different tools at different stages of the migration process. Here, we review some of the tools that can be used during the different phases of the migration process.

8.1. Architecture Representation
Software requirements and the architecture can be described in natural language and as models using different modelling languages such as the UML. For embedded systems with timing requirements, there exist many tools that allow modelling and specification of different views of the system. The APP4MC tool allows modelling and specification of the hardware as well as software components and provides support for scheduling algorithms. Another tool is the MARTE [47] profile for UML. The MARTE profile extends the UML models to include description of timing requirements. The MAST tool-suite allows for modelling as well as performing automatic schedulability analysis and supports many of the common scheduling algorithms for single-core as well as multi-core architectures. UPPAAL [25] is another tool for modelling the software as timed-automata and it supports model checking for formal analysis and verification. A few concerns with many of these tools are that some have steep learning curves, while others such as UPPAAL are not scalable to large systems and almost all lack support for automatic conversion of existing source code to abstract models.

8.2. Architecture Recovery
For architecture recovery, static code visualization tools such as CodeSonar and Imagix could be used. For dynamic analysis, tools which provide visualization of the run-time behaviour along with statistical information on timing properties can be effective. For example, Tracelyzer allows visualization of the run-time behaviour and provides different views to analyse this in-

We chose a survey-based approach to evaluate the proposed methodology. We followed the guidelines provided by Kitchenham et al. [48] for survey-based research and the discussion of the results. We begin by describing the design of the survey and then discuss the results of the survey.

9.1. Survey Design
As a first step in the survey-based evaluation, we identified (i) feasibility, (ii) usability and (iii) usefulness as the evaluation objectives for the migration methodology. Next, we identified the target population for the evaluation to be those organisations that develop complex real-time software systems such as industrial automation systems and construction vehicles. We identified a sample from the target population in a non-probabilistic manner through convenience and judgement-based sampling. We created the survey instrument in the form of an online questionnaire that included both closed- and open-ended questions. The closed-ended questions were designed to verify the generalisation of the observations and the applicability of the different steps in the methodology. The open-ended questions required the respondents to provide their opinion in a textual format on feasibility and usefulness of the methodology. The complete questionnaire was piloted by requesting colleagues not involved in the study to ensure clarity of language before it was shared with the respondents. The questionnaire was made available digitally and included a brief overview of the purpose of the questionnaire.

9.1.1. Evaluation Objectives
As previously mentioned, we identified three key objectives for the evaluation, namely feasibility, usability and usefulness of the methodology. For each of these objectives, we adopt the definitions used by Adesola et al. [49] to evaluate their business improvement process methodology. Briefly, we use feasibility to imply that all the steps in the methodology can be followed in practice. We use the term usability to refer to the ease of applicability of the methodology steps and the tools mentioned therein. We use usefulness to refer to the outcome of applying the methodology to relevant systems by an organisation. Furthermore, we also included the objective of validating the possibility of generalising key observations in the methodology.

9.1.2. Target Population and Sampling Strategy
To address the evaluation objectives, the target population was identified as organisations developing complex real-time systems. As for the sample, we identified 2 different departments within the same organisation working on independent and unrelated products and also two other organisations. We then identified 9 expert practitioners from the sample group as the most relevant for the evaluation. The participants were chosen based on their experience in managing and developing software (10+ years) for industrial systems, their background in multi-core technologies and their knowledge of the application domains.

9.1.3. Instrument Design
The survey was designed in the form of a questionnaire, combining nominal, closed-ended questions and open-ended questions requiring textual input from the respondents. The questionnaire was designed to address two different aspects, (i) problem relevance and (ii) methodology evaluation. For the problem relevance, we developed six questions to verify if the respondents were considering multi-core platforms for their products. The rest of the questionnaire was focused on methodology evaluation. We classified the evaluation-related questions as either implicit or explicit. The implicit questions required the respondents to reflect on the overall feasibility, usability and usefulness of the methodology. The explicit questions were designed to validate the generalisation of some of the observations made in the methodology. Table 2 shows the mapping among the different steps of the methodology, the evaluation type for each step and the associated question IDs. Appendix A.3 shows the questionnaire.

9.2. Survey Results and Discussion
As mentioned previously, the questionnaire was shared with nine carefully identified participants from the sample population. Of the nine participants invited, five respondents participated in the survey. We use the labels A, B, C, D and E to refer to each respondent individually. We discuss the results for the objectives of problem relevance, generalisation, overall feasibility, overall usability and the overall usefulness.
spective, 4 of the 5 respondents (A, B, C and E) said that their applications were not designed for multi-core. Respondent D said that their applications were designed for multi-core but they have been developed from scratch with only limited reuse of existing code. Respondents C and E confirmed that they are planning to migrate to a multi-core platform while the rest of the respondents did not provide any information. Additionally, the same four respondents chose the option of redesigning the application while reusing the existing code over developing the application from scratch. The responses indicate that migration to multi-core platforms is being considered in the industry and at the same time, the respondents prefer reusing the existing code over the development of the applications from scratch.
Generalisation and Feasibility. Since the methodology was developed based on observations of one system, we created the questionnaire to verify if the observations made in different steps can be generalised for other complex real-time software systems as well.

This was done by asking directed nominal questions focused on architecture representation, architecture recovery (runtime analysis and documentation), architecture transformation (ranking of solutions), and verification migration. For the architecture representation, the results indicate that only parts of the application can be described by timing properties such as worst-case execution times, periods and deadlines.
Similar to the observations about lack of information in the documentation, 4 of the 5 respondents (A, B, C and E) said that the application design was not fully documented. Further, only one respondent said that the timing properties were discussed in the design documentation while the rest of the respondents said that the timing properties of only a few critical parts of the application were discussed in the documentation.
The methodology relies on the presence of diagnostic information such as execution times and periodicity for architecture recovery. All the respondents said that their systems provide such diagnostic information. Furthermore, all the respondents mentioned that their applications had multiple configurations and that the run-time behaviour depended on the configuration. None of the respondents said that they tested all possible configurations but only a few. Four out of five respondents (A, B, C and D) said they tested average-case configurations. Furthermore, respondents A and E said that they test the worst-case configurations while respondent D the worst-case configurations.
Threats to Validity. Since the evaluation of the methodology has been carried out using a survey, we include a discussion on the validity of the results. Kitchenham et al. [48] advocate that a survey is reliable if it has been administered multiple times and if we get similar results each time. In our case, the survey was administered only once. This implies that the results may vary if the respondents were to answer the questionnaire at different times. However, much of the questionnaire had nominal questions and the options provided were binary but with an additional option to provide textual information, thereby limiting the possibility of variability in the responses. Furthermore, although the sample group was carefully chosen in a non-probabilistic manner, it is possible that a different sample of respondents may have provided different responses, affecting the validity of the conclusions drawn from the survey results. While the survey included questions relating to generalisation of the observations, not all of the methodology steps were explicitly considered but were included under the general questions of overall feasibility, usability and usefulness. Explicit questions may have led to a different conclusion from the one discussed in the paper.

10. Conclusion
Migration of complex embedded software from single-core to multi-core computing platforms is non-trivial. To ensure a successful migration of these software systems, a systematic approach is needed that takes multiple software engineering perspectives into account such as software processes, software architectures, requirements engineering, reverse engineering, model-based development, real-time scheduling and schedulability analysis. In this paper, we presented a systematic multi-stage methodology for migrating real-time industrial software systems from single-core to multi-core computing platforms. In this regard, we studied a complex real-time software system from the automation industrial domain that requires such a migration. We used focus group discussions, expert interviews and reviewed the literature to guide the development of the migration strategy. We identified the software architecture transformation as the main phase in the migration process and presented a systematic approach to perform the transformation with emphasis on the architecture recovery and an evaluation mechanism for possible multi-core solutions. We used task-level abstraction of the system to drive the transformation and associated timing properties to task-level models and proposed their use as input for the evaluation of multi-core solutions. To select suitable solutions from the set
tems Symposium, 2016.
on model- and component-based development of predictable embedded software, modeling and timing analysis of in-vehicle communication, and end-to-end timing analysis of distributed embedded systems. Within this context, he has co-authored over 135 publications in peer-reviewed international journals, conferences and workshops. He has received several awards, including the IEEE Software Best Paper Award in 2017. He is a PC member and referee for several international conferences and journals respectively. He is a guest editor of IEEE Transactions on Industrial Informatics (TII), Elsevier's Journal of Systems Architecture and Microprocessors and Microsystems, ACM SIGBED Review, and Springer's Computing journal. He has organized and chaired several special sessions and workshops at international conferences such as IEEE's IECON, ICIT and ETFA. For more information see http : //www.es.mdh.se/staf f /280 − S aad M ubeen.

Table 1: Subset of the tasks in the Robot Controller.

). The trajectory generation functionality is realised with the tasks TS IPL Path and TS IPL JointPath. Further, the controller software includes the system state manager tasks, namely TS Sys Events and TS Sys Backup, that are responsible for managing different system level signals and generating events that define the behaviour of other tasks. For example, the system state manager task can observe a change in the state of the safety switch signal and generate an event that will trigger a mode change from normal operation mode to a fail-safe mode.

3. Related Work
Software migration is usually carried out when adopting a different architectural paradigm than the existing one, such as changing the programming language [12] or when moving from native server deployments to cloud-based deployments [13, 14]. Sneed [15] proposed a five-step re-engineering planning process for legacy

Table 2: Mapping among the different steps of the methodology, the evaluation type for each step and the associated question IDs.

This indicates that iden-
For the verification migration stage of the methodology, a key assumption is that the complex real-time systems such as the one discussed in this paper have a robust testing mechanism in place for verifying functional correctness. All the respondents agreed that they do have such a mechanism in place. Further, all respondents agreed that they will reuse the existing tests to verify the behaviour of the systems after migration, which is consistent with the assumptions made in the proposed methodology.
The results of the questionnaire so far indicate that many of the observations can be generalised to other complex real-time systems. One key observation, however, is that describing all of the application components with timing properties may not be possible. For the steps not discussed in generalisation, we address them from the overall feasibility perspective discussed next.
Overall Feasibility. In order to validate the feasibility of the methodology, i.e., to verify if all the steps of the methodology can be followed, the respondents were asked to answer if they found the methodology feasible and to describe the rationale behind their choice. Four