OPTIMIZING SYSTEM-ON-CHIP VERIFICATIONS WITH MULTI-OBJECTIVE GENETIC EVOLUTIONARY ALGORITHMS

Abstract. Verification of semiconductor chip designs is commonly driven by single-goal-oriented measures. With increasing design complexities, this approach is no longer effective. We enhance the effectiveness of coverage driven design verifications by applying multi-objective optimization techniques based on genetic evolutionary algorithms. Difficulties with conflicting test objectives and the selection of tests to achieve multiple verification goals in the genetic evolutionary framework are also addressed.


1. Introduction. Verification of hardware designs such as system-on-chips (SoC) is a common bottleneck in every chip design project [2]. The difficulty with design verification arises not only from increasing design complexities, but also from other verification goals and constraints imposed by testing of the design itself. Current goal-oriented test strategies emphasize verifications that are typically driven by a single coverage measurement, which measures the progress and comprehensiveness of testing.
To enhance SoC verification, besides test coverage, other verification objectives should be simultaneously optimized with one another to drive verification procedures such as test generation. This paper proposes multi test objectives driven SoC verification. The ability to cater for multiple goals simultaneously within the same verification process is facilitated by test generation based on multi-objective optimization using genetic evolutionary algorithms (GEA).
Our GEA test generation can be driven by several different criteria, each capturing specific verification goals. This exploits the parallelism available from multi-objective optimization by amalgamating multiple test processes into a single GEA flow. Extending conventional coverage driven verification (CDV) with a feedback control strategy, the test generation can be directed by more than one coverage measure and other test objectives concurrently.
A number of techniques have been proposed for CDV, but so far, only single coverage objective verifications have been attempted. Nativ et al. [16] presented an integrated microarchitecture test generation and simulation framework. Fine and Ziv [10] devised CDV modelled as a Bayesian network, whereby learning algorithms are used to train the CDV process. Tasiran et al. [23] extended the observability based coverage method with random biased simulation and formal verification (FV) techniques to implement a CDV flow.
The GEA paradigm for CDV has been demonstrated by Corno et al. [8] for verifying processor based hardware designs. The GEA method has been applied at various levels of verification and testing, from validating high-level software models of the hardware described in SystemC [18], to pre-silicon design verifications, and automatic test pattern generation (ATPG) physical testing [9]. In [5], our single-objective GEA method was also employed for design verifications of SoCs. We have shown that this GEA based CDV method outperformed other test methods. In this paper, we extend CDV from the single goal domain to multi test objectives driven verifications.
The success of multi-objective GEA verification depends on how objectives are incorporated into the test generation flow and the test platform. When multiple objectives are simultaneously targeted, a number of these objectives may be conflicting. For instance, in verification, coverage and test size objectives compete against one another during the GEA process. This is because higher coverage requires excitation of more design functions, which results in greater test sizes. Vice versa, a smaller test size usually restricts the extent of testing on the SoC, giving lower coverage. Besides developing optimization techniques for verification of hardware designs, another contribution of this paper is to propose solutions that address the difficulties with conflicting objectives.
In multi-objective GEA test generations, tackling multiple conflicting objectives presents the most difficulties. Like [15,17,20], we consider multi-objective GEA hardware verification as an optimization problem, and we propose techniques for representing and manipulating multiple objectives in the GEA optimization. Current strategies for handling multiple objectives are classified into two categories: aggregation and Pareto methods [7].
Aggregation combines the fitness values of multiple objectives together into a single fitness value. A range of aggregation methods has been proposed, each with varying success. For example, fitness values can be added based on differing objective priorities assigned by the user, or simply summed together and averaged out amongst the number of objectives. Jakob et al. [14] assigned specially devised weights to objectives and summed their fitness values accordingly. Goal attainment combines fitness values depending on how objectives satisfy certain goals [24]. Whilst it is easy to apply, the downside with aggregation is that certain objectives may dominate the GEA optimization process, leading to other objectives acquiring low fitness.
Pareto optimizing strategies for fitness assignment and GEA test selection were proposed by Goldberg et al. [12]. Their approach was to sort and rank all test solutions into different Pareto optimal subsets of tests, known as fronts [21], and then perform test selections. Extensions of Goldberg's method have continued. These include ranking the solutions within each front based on the number of other solutions each dominates [11]. In [1], Pareto optimal solutions are employed for multi-objective optimization in the modelling of welfare economies. Like [1], the difficulty with large Pareto processing of objectives is tackled in [3]. We tackle Pareto optimization of large processes such as SoC verification, but difficulties with conflicting objectives are also examined.
Other schemes that extend Pareto techniques are the strength Pareto approach [25] or methods that combine tournament selections. The drawback with these methods is that much larger test populations are needed to provide a suitable sample set of solutions [13]. Our multi-objective GEA approach extends current methods by combining both aggregation and Pareto optimal methods to exploit their advantages and compensate for each other's shortcomings.
The remainder of the paper is as follows. The next section presents the problem statement and the multi test objectives driven verification technique. Section 3 describes how multiple conflicting test objectives are managed. The GEA test population selection method that optimizes multiple verification goals is described in Section 4. Experimental results are presented in Section 5, before the paper is concluded in Section 6.

2. Overview of technique. The key to our multi test objectives driven verification technique is to encode the test generation as a multi-objective GEA process. Verification goals are posed as objectives of the test generation to be optimized concurrently according to the problem description below.
2.1. Problem formulation. In the context of semiconductor design verification, the multi-objective optimization problem is formulated as:

(P1)  maximise f_l(x), f_t(x), f_c(x) and minimise f_s(x), subject to f_s(x) ≤ M, x ∈ X,

where x is a test from X, which represents the input solution test space of the test generation, f_l(x) is the line coverage fitness function of the SoC when exercised by the test x, f_t(x) is the toggle (i.e., binary signal) coverage fitness function, f_c(x) is the conditional coverage fitness function, f_s(x) is the function that evaluates the size of the test, and M is the maximum size realizable to hold a test for execution. In SoC verification, different types of coverage objectives, measured between 0 and 1, are employed to examine the comprehensiveness of the testing process, whereby a value of 1 indicates full coverage.
To find a solution for P1, GEA was chosen. The test generation is encoded into the GEA process by (i) representing the test suite as the GEA population, (ii) assigning a test program to act as an individual chromosomal solution, and (iii) mapping the test building blocks to be the genome set that makes up each chromosome test individual.
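The population/chromosome/genome encoding above can be sketched directly; in this fragment the snippet names, parameter ranges, and sizes are invented for illustration and are not taken from the paper:

```python
import random

# Illustrative genome set: a pool of test building-block snippets.
SNIPPET_POOL = ["mem_rw", "irq_burst", "alu_stress", "dma_copy"]

def random_chromosome(max_len=5):
    """One test program acts as a chromosome: an ordered sequence of
    building-block snippets, each with a randomisable run-time parameter."""
    length = random.randint(1, max_len)
    return [(random.choice(SNIPPET_POOL), random.randint(0, 255))
            for _ in range(length)]

def initial_population(size=15):
    """The test suite acts as the GEA population."""
    return [random_chromosome() for _ in range(size)]

suite = initial_population()
print(len(suite))  # 15 test-program chromosomes
```

Each chromosome can then be rendered into an executable test program by emitting the snippet functions in sequence with their chosen parameters.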
The test building block genome consists of snippets of software functions [6]. Many different permutations of these snippets are selected, then mutated and combined in a GEA manner to generate test programs for conducting SoC design verifications. GEA mutation involves randomising the internal run-time parameters of the snippets' software functions, whilst crossover inter-mixes different sequences of snippet functions from one test program with another; additional details regarding these GEA operations are provided in [6].

Figure 1 shows the high-level flow of the multi-objective GEA test generation. The GEA process begins by creating an initial population of tests and measuring their initial fitness performance for each objective. Variation is conducted using these initial tests to create new test variants. The variation operations carried out on individual tests are to add, subtract, mutate, and replace test building blocks in existing tests. Recombination variation combines two or more existing tests to create new tests.
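The variation operations above can be written as list manipulations on the chromosome representation; a minimal sketch, where the snippet names and the (0, 255) parameter range are assumptions for illustration:

```python
import random

SNIPPETS = ["mem_rw", "irq_burst", "alu_stress", "dma_copy"]  # illustrative names

def new_block():
    return (random.choice(SNIPPETS), random.randint(0, 255))

def add(test):                               # append a new building block
    return test + [new_block()]

def subtract(test):                          # drop a building block
    return test[:-1] if len(test) > 1 else test

def mutate(test):
    """GEA mutation: randomise the run-time parameter of one snippet."""
    i = random.randrange(len(test))
    name, _ = test[i]
    return test[:i] + [(name, random.randint(0, 255))] + test[i + 1:]

def replace(test):                           # swap one block for a new one
    i = random.randrange(len(test))
    return test[:i] + [new_block()] + test[i + 1:]

def recombine(a, b):
    """Crossover: inter-mix snippet sequences from two test programs."""
    cut_a = random.randrange(1, len(a) + 1)
    cut_b = random.randrange(len(b) + 1)
    return a[:cut_a] + b[cut_b:]
```

Applying a random mix of these operators to the current population produces the next generation of test variants.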
Once variation has created the next population of tests, their fitness is evaluated in the same way as the initial test population. Fitness evaluation is conducted concurrently for all objectives. Using fitness results, population selection chooses which tests to retain for the next evolution cycle.
Population selection is the most critical stage of the multi-objective GEA flow because it must interpret multiple objectives. The selection scheme establishes the selection criteria by which tests are retained to optimize all objectives simultaneously. Preserving a diverse population is essential for the GEA process to continue evolving tests that caters for all objectives in subsequent evolutions. In Sections 3 and 4, we describe a novel selection scheme that classifies tests into different subsets according to conflicting objectives criteria.
Throughout the multi-objective GEA process, the optimization feedback loop (see Figure 1) allows for test cases with best fitness results to be fed back to the test variation phase. This controls generation of new tests that ensures required functionalities and critical areas of the SoC are continually verified.
In multi-objective GEA, the goal of the GEA process is to seek a set of optimized solutions that exhibit the best possible trade-off performance for all objectives. The successful creation of an optimized set of tests relies on tackling two sub-problems: (i) SP1: how multiple conflicting objectives are encoded and managed, and (ii) SP2: how diverse tests are selected during the test generation process.

3. Conflicting objectives sub-problem.

3.1. Conflicting objectives. Two objective fitness functions f_1 and f_2 are in conflict if improving one degrades the other, that is, if there exist tests satisfying f_1(x_1) > f_1(x_2) and f_2(x_1) < f_2(x_2), where x_1 and x_2 are arbitrary tests applied to maximise (or minimise) both f_1 and f_2 simultaneously; then f_1 and f_2 are said to be in conflict against each other.
In the GEA optimization, such conflicting objectives cause slow convergence to a solution.
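The pairwise conflict condition can be checked mechanically over a sample of tests; in this sketch the stand-in tests and toy fitness functions are invented for illustration:

```python
from itertools import combinations

def in_conflict(f1, f2, tests):
    """f1 and f2 conflict (for maximisation) if some pair of tests
    improves one objective while degrading the other."""
    for x1, x2 in combinations(tests, 2):
        if (f1(x1) - f1(x2)) * (f2(x1) - f2(x2)) < 0:
            return True
    return False

# Toy check: coverage grows with test size, so maximising coverage
# conflicts with maximising a prefer-smaller-tests objective.
tests = [1, 2, 3, 4]                 # stand-in tests, here just their sizes
coverage = lambda x: x / 4           # bigger test -> more coverage
small_is_good = lambda x: -x         # prefer smaller tests
print(in_conflict(coverage, small_is_good, tests))  # True
```

An objective trivially never conflicts with itself, since both differences then share the same sign.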

3.2. Partitioning conflicting objectives. The slow convergence of a GEA solution is due to one objective's fitness causing degradation of another objective. The method we propose to speed up the process is to partition the multiple objectives into subsets of objectives that explicitly conflict with one another.
This partitioning approach enables the aggregated fitness from each set of conflicting objectives to be considered directly, rather than taking on the entire set of multiple objectives all at once during the GEA process. It further reduces the likelihood of any one objective dominating another during optimization. Greater opportunities are also presented for objectives to be optimized from a Pareto optimal viewpoint, as described later in Section 4.
The partitioning of conflicting objectives into subsets is governed by their relationship with other objectives as described in sub-problem SP1. After identifying conflicting objectives, these objectives are grouped together whilst non-conflicting objectives are separated from each other. The process of segregating objectives into subsets is termed specialisation. Within each subset, objectives are considered specialised, to compete directly against other conflicting objectives in the same subset. Under the context of SoC verification, the definition for objectives specialisation is as follows.
Definition 3.1. Specialisation of objectives: A specialised subset of objectives contains objectives that are in conflict with one another. For multi-objective GEA test generation, the objectives are specialised into three conflicting objective subsets:

L = {f_l, f_s},  T = {f_t, f_s},  C = {f_c, f_s},

where f_1 and f_2 from sub-problem SP1 may be represented by any of the f_l, f_t, f_c, or f_s objective fitness functions.
Remarks. (i) These objective subsets imply that the three coverage metric types for lines, toggles, and conditions do not conflict with one another, but each is in conflict with the test size objective. The three coverage metric types are not independent of one another, but optimizing one particular type of coverage does not adversely impinge on the fitness of other coverage objectives.
(ii) The slow convergence of optimal solutions for conflicting objectives in SP1 is improved by specialisation of objectives into different subsets. By grouping and only considering conflicting objectives within each subset, the GEA process gives higher priority toward optimizing these groupings of conflicting objectives quickly, before tackling the entire set of multiple objectives. Once these subsets of conflicting objectives are processed, the population selection phase (Section 4) can then proceed with selecting tests that perform best for each subset.
Note that the above definition and subsequent discussions in this paper are applicable in general to more than two conflicting objectives. In this paper, we focus on two conflicting objective functions at a time to simplify the discussion and illustration.
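Given a conflict relation over the objectives, the specialised subsets can be enumerated mechanically. In this sketch the conflict predicate encodes the relationship stated in the remarks above (each coverage metric conflicts only with test size) and stands in for the SP1 analysis:

```python
from itertools import combinations

def specialise(objectives, conflict):
    """Partition objectives into subsets of explicitly conflicting pairs
    (Definition 3.1); `conflict` is a predicate over objective pairs."""
    return [frozenset(pair) for pair in combinations(objectives, 2)
            if conflict(*pair)]

# Conflict relation from the remarks: every coverage metric conflicts
# with the test size objective; coverage metrics do not conflict.
coverage_metrics = {"f_l", "f_t", "f_c"}
conflict = lambda a, b: (a in coverage_metrics) != (b in coverage_metrics)

subsets = specialise(["f_l", "f_t", "f_c", "f_s"], conflict)
print(sorted(sorted(s) for s in subsets))
```

The result reproduces the three subsets L, T and C of Definition 3.1, each pairing one coverage objective with the test size objective.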

4. Multi-objective test selection sub-problem. The second sub-problem to address when introducing multiple objectives into the GEA verification flow is how to select tests that optimize multiple objectives simultaneously. Following on from the specialisation of conflicting objectives presented in Definition 3.1, we propose to conduct a three phase test population sort and selection using these subsets. The three phases are: (1) Pareto optimal sorting, (2) aggregate ranking, and (3) round-robin test selection. Figure 2 shows the overall flow of multi-objective GEA test selection with the three phases embedded. We now describe the three phases.

Figure 2. Three phase multi-objective GEA test population selection.

4.2. Phase 1: Pareto optimal sorting. In the Pareto based method, the goal is to evolve GEA solutions into a Pareto optimal test population. A Pareto optimal population contains the set of tests which are the optimization solutions such that additional improvement in any one objective requires a trade-off in the performance of other objectives [21].
The set of Pareto optimal tests is considered non-dominated. Tests are non-dominated if they are not inferior to any other tests with respect to the conflicting objectives targeted. All Pareto optimal tests exhibit better performance in at least one objective, and attain at least the same performance for all other objectives. We now define Pareto optimality and Pareto optimal fronts in the context of multi-objective optimization.
Consider n solution points x_1, . . . , x_n in the solution space X, where the solution points in X are applied to maximise (or minimise) the set of m multiple objective functions f = (f_1, . . . , f_m).

Definition 4.1. Domination: A solution x_1 is said to dominate another solution x_2, denoted x_1 ≻ x_2, if x_1 is no worse than x_2 in all m objectives and strictly better in at least one objective.

Definition 4.2. Pareto optimality: A solution x ∈ X is said to be Pareto optimal if there does not exist any other solution x_k ∈ X such that x_k dominates x. That is, x_k ⊁ x for all k ∈ {1, . . . , n} with x_k ≠ x.

Definition 4.3. Solution mapping function: Let p : P(X) → P(X) be a mapping function that identifies the set of Pareto optimal solutions forming a Pareto front according to Definitions 4.1 and 4.2, where P(X) is the power set of X.
Given subsets A ∈ P(X) and B ∈ P(X), the function p is defined as B = p(A) such that ∀x ∈ B and ∀y ∈ A \ B, y ⊁ x. That is, p identifies and populates B with the non-dominated Pareto optimal solutions from A.

To explain, consider a two-objective optimization process of maximising a coverage metric (f_1) against minimising the test size (f_2). Figure 3 shows the Pareto plot where each solid dot represents a solution point. The Pareto optimal solutions are located toward the top left region of the graph. By joining the Pareto optimal solution points together in a line, a wave-like front is formed (refer to Front 1 in Figure 3). This Pareto optimal front dominates all the other solutions to its lower right.
The GEA procedure aims to steer the Pareto optimal front toward the top left portion of the graph. The best compromise solutions amongst the two objectives are displayed along the line forming the best Pareto front 1.
Besides one Pareto optimal front, multiple Pareto optimal fronts in a hierarchical order can also be plotted (i.e., Fronts 2 to 4 in Figure 3). By excluding the current best Pareto optimal set of solutions (i.e., in Front 1), the remaining subset of solutions can be re-examined to identify another new Pareto optimal set (i.e., Front 2) as per Definitions 4.1 to 4.4. This then acts as the next lower hierarchical Pareto optimal front. Further Pareto optimal sets of solutions and fronts can be identified by repeatedly excluding the previous best Pareto optimal solutions and applying Definitions 4.1 to 4.4. This provides a series of Pareto optimal fronts as shown in Figure 3, whereby the first front represents the best Pareto optimal set. Identifying the series of Pareto optimized fronts facilitates a sorting process for the population, which is described in the next sub-section.
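The dominance check, the mapping p, and the repeated peeling of fronts can be sketched as follows. Maximisation is assumed for every objective, so test size is negated, and the sample score points are invented for illustration:

```python
def dominates(fa, fb):
    """fa dominates fb (maximisation): no worse in every objective,
    strictly better in at least one (cf. Definition 4.1)."""
    return (all(a >= b for a, b in zip(fa, fb))
            and any(a > b for a, b in zip(fa, fb)))

def pareto_front(scores):
    """The mapping p: keep only the non-dominated points."""
    return [x for x in scores
            if not any(dominates(y, x) for y in scores if y != x)]

def sort_into_fronts(scores):
    """Repeatedly peel off the best front to build the hierarchy of
    Pareto optimal fronts (Front 1, Front 2, ...)."""
    fronts, remaining = [], list(scores)
    while remaining:
        front = pareto_front(remaining)
        fronts.append(front)
        remaining = [x for x in remaining if x not in front]
    return fronts

# Two objectives per test: (coverage, -size), both maximised.
scores = [(0.9, -8), (0.7, -3), (0.5, -2), (0.6, -7), (0.4, -9)]
fronts = sort_into_fronts(scores)
print(fronts[0])  # [(0.9, -8), (0.7, -3), (0.5, -2)]
```

Here (0.6, -7) falls into Front 2 because (0.7, -3) beats it on both coverage and size, illustrating how each lower front is dominated by the one above it.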
With Pareto optimization, certain characteristics are desired when creating the first (best) Pareto optimal front. These characteristics are: (i) The Pareto front for each GEA evolution should expand towards the upper left region as more evolutions are conducted to optimize the population.
(ii) The larger the expansion achieved by the Pareto front, the more diverse the population will be. Diversity is important because it provides a greater sample of solutions to choose from.
In Figure 2, the test population is ordered to produce different Pareto fronts depending on which objectives were used to perform the Pareto sorting. A test may perform well for one set of conflicting objectives but not effectively for another. For example, tests x_1, x_2, x_3 and x_4 perform well for the L and C objective subsets, and are placed within the first front of B_L and B_C (where each B_∗ holds a bin of tests). However, x_1 and x_2 do not optimize the T objectives as effectively, and are placed in the second Pareto front of B_T.
The test selection makes use of the Pareto fronts from each Pareto sorted bin to ensure best tests that cater for each objective subset are retained for further optimization in future evolutions. The output from each of these sub-GEA Pareto processes will be combined later in the test selection phase 3 to continue overall multi-objective optimization in subsequent evolutions.
By Pareto optimizing the test population into a hierarchical structure, each front within a bin captures a set of trade-off tests, separated into different levels. Each hierarchical frontal set of tests achieves different levels of compromised fitness for conflicting objectives. This allows the test selection process to choose tests from different levels, thus promoting population diversity.
4.3. Phase 2: Aggregate ranking. Within each Pareto front, tests are ordered by an aggregated fitness value

f_a(x) = f_m(x) + (1 − f_s(x)/M),   (1)

where f_m is either the line, toggle or conditional coverage fitness function, f_s is the test size fitness function, and M is the maximum test size allowed. The set X is the input set of possible tests created by the multi-objective GEA.
With (1), the test size objective fitness f_s(x) undergoes a transformation such that a smaller test size provides a greater contribution to the aggregated fitness value f_a(x). For example, Figure 4 shows the sorting of tests from aggregate ranking for the subset L. The tests in every hierarchical Pareto front are ordered so that each test is given a ranking according to its aggregated fitness value calculated from (1). The larger the aggregated fitness value, the higher the ranking assigned to the test. Tests assigned a higher aggregate ranking have a higher likelihood of selection in the test selection phase described next.

4.4. Phase 3: Round-robin test selection for next population. In the final phase, the GEA test selection process makes use of the Pareto and aggregate sorted bins of tests to select tests for the next evolution population. Tests are selected in a round-robin manner, cycling between the three sorted B_L, B_T and B_C bins of tests.
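Phases 2 and 3 can be sketched together. The aggregate below assumes the size transformation takes the form 1 − f_s(x)/M, consistent with (1) and the coverage range [0, 1]; the bin contents and test names are invented for illustration:

```python
M = 10  # maximum test size allowed

def aggregate_fitness(f_m, f_s):
    """Coverage fitness plus the transformed size fitness, so that
    smaller tests contribute more to the aggregate (cf. Equation (1))."""
    return f_m + (1 - f_s / M)

def rank_front(front):
    """Order one Pareto front's tests by descending aggregate fitness.
    Each entry is (test, coverage_fitness, size_fitness)."""
    return sorted(front, key=lambda t: aggregate_fitness(t[1], t[2]),
                  reverse=True)

def round_robin(bins, population_size):
    """Phase 3: cycle between the sorted B_L, B_T and B_C bins,
    taking the next best-ranked test from each bin in turn."""
    selected, i = [], 0
    bins = {k: list(v) for k, v in bins.items()}
    names = list(bins)
    while len(selected) < population_size and any(bins.values()):
        name = names[i % len(names)]
        if bins[name]:
            test = bins[name].pop(0)
            if test not in selected:  # a test may appear in several bins
                selected.append(test)
        i += 1
    return selected

bins = {"L": ["x1", "x2"], "T": ["x2", "x3"], "C": ["x4"]}
print(round_robin(bins, 4))  # ['x1', 'x2', 'x4', 'x3']
```

Because a test can sit in more than one bin, the duplicate check keeps the selected population distinct while preserving the round-robin fairness between bins.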

4.5. Summary. Using the Pareto based test selection strategy, the highest aggregate ranked tests from the best Pareto optimal frontal set are selected preferentially from each bin. Therefore, the best trade-off tests as ordered by the sequence of Pareto optimal fronts are chosen. Given the aggregate rankings within each front, the best performing tests for all conflicting objectives are chosen first. Low performing tests for any objective will not be selected from any bin, and are released from the GEA process.
By dividing objectives into different subsets and sorting tests within each bin according to various objectives, diversity in the test population is also enhanced.
The tests selected will vary greatly from an objectives optimization perspective. The resultant test population will be effective for optimizing the entire set of objectives.
The selection method also promotes fairness. A test is selected from each bin every iteration using round-robin sharing amongst these bins. Tests are given equal selection priority from each of the objective subset bins.
The goal of the Pareto and aggregation phase is to rank tests so test selection can be conducted according to different subsets and priority of certain objectives. It addresses a common issue (described in [22]) whereby ranking skews tests to focus on certain conflicting objectives only, but ignores other objectives. Our approach alleviates this problem by identifying subsets of conflicting objectives and managing these smaller groups of objectives individually. This reduces the possibility of objectives being overlooked for optimization if they were all considered together at once.
By employing round-robin selection to select tests from the Pareto fronts of each objective subset, diversity of tests that caters for both conflicting and non-conflicting objectives is maintained. Round-robin selection also reduces the likelihood of one objective dominating another.

5. Experiments. The test generation method was implemented as a test generation tool. It is integrated into our verification platform [6] to create tests for the Nios SoC [26], an industry-standard programmable chip.
In order to calibrate the verification platform and select test generation operating parameters, an analytical technique based on Markov chain modelling of the GEA test creation procedure was undertaken. Our Markov analysis method [4] was employed specifically to model the GEA variation flow of the test generation. Test generation parameters were chosen to maximise the likelihood of desired test creation characteristics. Specifically, the test generation was configured to terminate after 30 evolutions, the parent to children population size was set to a ratio of 1:2 and restricted to 15 and 30 respectively, and the length of the test program snippets was limited by the memory capacity of the Nios SoC that holds the test programs. The mutation, crossover and selection likelihoods vary depending on the Markov modelling in [4].
Our experiments are presented in two sections. Pareto and test diversity characteristics of the multi-objective GEA test generation process are examined in Section 5.1. Section 5.2 assesses the performance of our multi-objective test generation against other verification schemes.

5.1. Multi-objective GEA test generation characteristics.

5.1.1. Pareto characteristics. Figure 5 shows the Pareto front plot of the toggle coverage and test size objective subset from our test generation process. The best Pareto fronts from each evolution are plotted for toggle coverage achieved against test sizes. Best Pareto front plots show the fitness trade-offs possible between coverage and the conflicting test size objectives. While our discussion focuses on toggle coverage in Figure 5, we emphasize that similar plots and characteristics are prevalent for the line and conditional coverage with test size objective subsets. Figure 5 shows that tests achieve coverage enhancements whilst trying to contain test size. For instance, during early to mid evolutions, the best Pareto fronts rise upwards along the y-axis showing coverage gains, but smaller drifts to the right along the x-axis indicate test sizes did not increase rapidly.

This pattern continues until the middle stages of the evolution process, because opportunities for optimization and uncovering of new test space are higher with a low test size penalty. Toward later evolutions, optimizations to attain further coverage improvements whilst containing test sizes become increasingly difficult. Hence, the vertical y-axis gap between consecutive Pareto fronts becomes smaller.

Eventually, the GEA process reaches a plateau whereby no further coverage enhancement is possible using the current test building block genome and test population.

5.1.2. Test diversity characteristics. Pareto front length, slope and curvature characteristics at every evolution are all indicators of the types and range of tests that make up the test population. The larger the length and slope of the Pareto fronts, the more diverse the tests are.
In Figure 5 and similar plots for the other objective subsets, Pareto fronts are generally of sufficient size throughout the GEA process for a diverse test population.
The large expansion of the best Pareto front at each evolution indicates tests are diverse to exercise different types of SoC functions.
Besides Pareto front size, the Pareto front curvature characteristics reveal how effectively the GEA process optimizes objectives. A high curvature implies the GEA process is optimizing all objectives to find the best trade-off between coverage and test size fitness. Ideally, if the GEA process is to optimize multiple objectives effectively, the Pareto front curvature should also be convex toward the top left region of the plot. As demonstrated in Figure 5, between evolutions 20 to 25, the Pareto front curvature is pronounced with its peak aimed at this region, indicating that tests exist along the Pareto front for which the highest coverage and lowest test size are being actively evolved. From evolution 25 onwards, as test diversity starts to fall, the resultant test suite is unable to achieve the same rate of optimization for all coverage and test size objectives.
The Pareto front characteristics and test diversity results show our test generation method conducted multi-objective Pareto optimization as required. The SoC was verified with tests that actively maximised coverage metrics whilst test sizes and verification resources were minimised.

5.2. Multi-objective test generation comparison. In this section, the multi-objective GEA test generation is evaluated against four test creation techniques; two are based on single-objective schemes, and the other two are common methods employed in the verification industry. The techniques evaluated against are: (i) SAGETEG [5], a single-objective GEA test generator; (ii) the µGP test generator [19], an assembler instruction based test generator that also employs single-objective GEA; (iii) random constraint-biased test generation [6]; and (iv) application-driven test creation that is conducted manually. The coverage, test size and time results are presented in Table 1. Note that one multi-objective test generation run is conducted whilst the coverage, size and time performance is measured concurrently.
In terms of line and toggle coverage, multi-objective GEA clearly surpasses the other test generation methods, whilst conditional coverage shows lower improvement. Conditional coverage is the most difficult metric to maximise, and the limitations in enhancing conditional coverage further are due to the available test building block genome, not our test generation method. Despite this, multi-objective GEA testing acquires up to 17% better conditional coverage than the other test generations.
Comparing test size results, multi-objective GEA testing outperforms the single-objective GEA test generation of SAGETEG and µGP. Overall, the average multi-objective test size is approximately 7% smaller than SAGETEG and 23% smaller than µGP. This demonstrates multi-objective test generation achieves greater or equivalent coverage using smaller tests.
In the random approach, however, random test programs are created by stochastic means only, and do not increase in size significantly as they strive to attain coverage, hence their size is slightly lower than multi-objective GEA testing. On the other hand, GEA test programs evolve and grow incrementally using the test building block genome in order to seek greater coverage, but without significant size penalty.
Examining the time results in Table 1, multi-objective GEA test generation is efficient with regards to test execution and overall verification time. Multi-objective GEA completes at least 40% quicker, achieving superior or equivalent coverage, when compared with SAGETEG, µGP and random methods. For manual application testing, the elapsed testing time comparison is not applicable because creating tests manually will naturally be longer than automation.
In summary, the coverage, test size and time results demonstrate the effectiveness and advantages of our multi-objective GEA approach, especially when test size and resource constraints exist in a verification project.
6. Conclusions. This paper presented a multi test objectives driven verification technique using genetic evolutionary algorithms and multi-objective optimizations.
The key innovation of the multi-objective GEA method is realised by combining Pareto front sorting, aggregate ranking, and multi-phase divide-and-conquer GEA selection strategies. We demonstrated the multi-objective GEA test technique on the Nios SoC to maximise multiple coverage metrics and simultaneously minimise the sizes of the test programs executed. When compared with four other test generation methods, the multi-objective GEA scheme provided higher coverage across all coverage metrics, and utilised smaller test sizes and fewer resources.