An Elitist Non-Dominated Multi-Objective Genetic Algorithm Based Temperature Aware Circuit Synthesis

At sub-nanometre technology nodes, temperature is an important design parameter that must be addressed during circuit implementation to ensure long-term, reliable operation. High device packing density leads to high power density, which in turn generates high temperatures. Because the temperature of a chip is directly proportional to its power density, minimizing the power density reduces the likelihood of high temperature generation. Temperature minimization is generally addressed at the physical design level, but this incurs a high cooling cost. To reduce the cooling cost, temperature minimization can instead be addressed at the logic level. In this work, a multi-objective heuristic approach based on the Non-Dominated Sorting Genetic Algorithm-II (NSGA-II) is proposed to select an efficient input variable polarity of the Mixed Polarity Reed-Muller (MPRM) expansion for simultaneous optimization of area, power, and temperature. A Pareto optimal solution set is obtained from the vast solution space of $3^n$ different MPRM polarities, where $n$ is the number of input variables. A tabular technique is used for input polarity conversion from Sum-of-Product (SOP) form to MPRM form. Finally, the absolute temperature, silicon area, and power consumption of the synthesized circuits are calculated using the Cadence and HotSpot tools and reported. The proposed algorithm saves around 76.20% silicon area and 29.09% power dissipation, and reduces peak temperature by 17.06%, in comparison with the values reported in the literature.

path length using the Huffman tree construction algorithm. A shared mixed polarity RM (SMPRM) network is proposed in [16], using a weighted search GA (WSGA) to find the optimal polarity based on area, power, and temperature. A trade-off analysis among area, power, and temperature is also reported. However, it is very difficult to find the optimum polarity with WSGA, as one parameter may dominate the others. Other than [16], none of the articles have considered temperature as a cost metric for RM network synthesis. Yet temperature is an essential cost metric, because ICs need to operate within a temperature range stipulated by the manufacturers. Commercial and industrial devices operate within temperature ranges of 0 °C to 70 °C and -40 °C to 85 °C, respectively, which are much narrower than the operating range of aerospace and military devices (-55 °C to 125 °C) [17]-[18]. Due to aggressive device scaling and packing density, most integrated circuit (IC) burn-outs are caused by overheating. Overheating builds up because of the excessive power density generated within the chip when a vast number of complex functions are packed into a small silicon area. So far, most researchers have focused on the physical design domain for temperature minimization [19]-[20], but the cost of cooling solutions is rising at 1-3 dollars or more per watt of power dissipation [21]. The cooling cost of a high-performance processor increases exponentially with the growth of power density. Therefore, design-time thermal-aware techniques can be used to improve the power and thermal characteristics of integrated circuits. A few works report temperature minimization by reducing the power density at the logic synthesis level [22]-[26]. The power density is directly related to temperature generation within a chip by the following expression [27]:

$$T_{chip} = T_{amb} + R_{th} \frac{P_T}{A_T} \tag{1}$$

In equation (1), $T_{chip}$ and $T_{amb}$ are the average chip temperature and the ambient temperature, respectively; $R_{th}$ is the total equivalent thermal resistance of the substrate (Si) layer, package, and heat sink (in m² °C/W); $P_T$ is the total power dissipation (in W); and $A_T$ is the total silicon core area of the chip (in m²). This area does not include the package area, but it includes all the cell area and routing area of the silicon chip. Earlier research ignored thermal issues at the higher levels (logic synthesis and circuit design) of very-large-scale integration (VLSI) design synthesis. In [28], Shang and Dick reported that a rise in chip temperature adversely affects reliability, performance, cost, and power consumption. It is also reported in [28] that the cooling arrangement contributes 30% of the cost of IC packaging. Including a temperature-determining parameter at the logic synthesis level may therefore reduce the cooling cost.
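The relation between power density and chip temperature can be illustrated with a minimal sketch. It assumes the steady-state model $T_{chip} = T_{amb} + R_{th} P_T / A_T$ implied by the units given above; the function name and all numerical values are hypothetical and serve only to show the trend.

```python
def chip_temperature(t_amb_c, r_th, p_total_w, area_m2):
    """Average chip temperature in deg C, assuming T_chip = T_amb + R_th * P_T / A_T.

    r_th is in m^2*degC/W, p_total_w in W and area_m2 in m^2.
    """
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    power_density = p_total_w / area_m2  # W/m^2
    return t_amb_c + r_th * power_density


# Hypothetical values: the same power squeezed into half the area
# doubles the power density and therefore the temperature rise.
base = chip_temperature(45.0, 2e-5, 1.0, 1e-6)
half_area = chip_temperature(45.0, 2e-5, 1.0, 0.5e-6)
print(base, half_area)
```

Under this model, any logic-level change that lowers the ratio of power to area lowers the temperature rise above ambient by the same factor.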
An exact or exhaustive search method can be used for small circuits, but this strategy is not feasible for medium or large circuits. Determining the exact input variable polarity that gives the minimum cost is a non-deterministic polynomial-time hard (NP-hard) problem; no known algorithm can solve it in polynomial time. Non-exhaustive, heuristic search approaches have therefore been introduced to solve such NP-hard problems. Details of NP-hard problems can be found in [29]. The proposed work presents a fast-converging heuristic technique, the Non-Dominated Sorting Genetic Algorithm-II (NSGA-II), for the thermal-aware problem. Compared to existing optimization approaches, the contributions of the proposed approach are as follows:
• Thermal-aware AND-XOR logic synthesis is performed using the MPRM expansion methodology.
• NSGA-II is used to obtain the optimum solution in terms of area, power, and power density for MPRM circuits. The parameters of the NSGA-II algorithm are tuned to obtain the optimum solution.
• The simulation results of the proposed approach are reported as absolute temperature, total power consumption, and silicon area. The 'HotSpot' tool [30] is used to report the absolute temperature. The Cadence 'Innovus' tool [31] is used to report the total power consumption (dynamic and leakage) and the silicon area at 45 nm technology.
In the proposed approach, we use a ternary input variable polarity for chromosome encoding and modify the NSGA-II approach at the crossover and mutation level to find better offspring. Parent chromosomes for crossover and mutation are chosen from the elite group or from the entire population based on a threshold value. Two-point crossover is used to generate the offspring chromosomes. Random bit positions are chosen to increase the mutation diversity within the offspring. In the proposed work, power density is considered as a cost metric to reduce the thermal effect in the MPRM network. Finally, Electronic Design Automation tools (Cadence and HotSpot) are used for the actual area, power, and temperature calculation.
The rest of the paper is organized as follows: Section II presents the motivation and the basic terminology used in RM expansion. Section III presents the thermal-aware mixed polarity problem formulation using the tabular technique. The NSGA-II based thermal-aware realization is described in Section IV. Section V details the results, and finally, Section VI draws the conclusion.

A. Reed-Muller Expansion
Any n-input, m-output Boolean function can be represented canonically in AND-OR based Sum-Of-Product (SOP) form. The SOP is expanded with $2^n$ different product terms as shown below:

$$f(x_n, \ldots, x_1) = \sum_{i=1}^{2^n} p_i m_i \tag{2}$$

where $m_i$ represents the minterms and $p_i \in \{0, 1\}$ indicates the absence or presence of each minterm. The suffix $i$ indexes the terms and varies from 1 to $2^n$. If all the input variables are used to represent every minterm of an expression, the expression is said to be in Canonical Sum-Of-Product (CSOP) form. In a CSOP logic function, all OR gates can be replaced with XOR gates, which yields the ExOR Sum-Of-Product (ESOP) function. The ESOP form can be represented as:

$$f(x_n, \ldots, x_1) = \bigoplus_{i=1}^{2^n} p_i m_i \tag{3}$$

Here, $\oplus$ represents the ExOR operation. The expanded ESOP form can be written as: (4) Eq. (4) is also referred to as the Reed-Muller (RM) expansion, depending on the form in which each variable appears. Variables can appear in true form ($x_i$), in complemented form ($\bar{x}_i$), or in mixed form ($x_i$ and $\bar{x}_i$).

B. Fixed Polarity Reed-Muller Expansions
An expansion of the form of eq. (4) in which each variable appears in either true or complemented form, but not both at the same time, is known as a Fixed Polarity Reed-Muller (FPRM) expansion. FPRM expansion provides $2^n$ different polarities, or expansions, for a given problem. Example 1 demonstrates the formation of an FPRM expansion.
Example 1: Consider a Boolean expression with the function given by: (5) FPRM expansion polarities are defined with binary numbers as (6) If polarity $(101)_2$ is assigned to a function $f_1(x_3, x_2, x_1)$, then the variables $x_1$ and $x_3$ are expressed in true form and the variable $x_2$ in complemented form, by utilizing $\bar{x}_i = 1 \oplus x_i$ and $x_i = 1 \oplus \bar{x}_i$, respectively.
For the given polarity, the FPRM expansion of function $f_1$ is given by: (7)
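How an FPRM expansion is obtained for a chosen polarity can be sketched as follows. This is not the tabular technique used later in the paper; it is the standard GF(2) Moebius (Reed-Muller) transform applied to a truth table, shown here under stated assumptions. The function name `fprm_coefficients` is hypothetical, and the polarity list is indexed least-significant variable first (`polarity[0]` belongs to $x_1$), with 1 for true form and 0 for complemented form as in Example 1.

```python
def fprm_coefficients(truth_table, polarity):
    """Return the FPRM coefficients of a single-output Boolean function.

    truth_table[k] is f at the input whose bit i (LSB first) is x_{i+1}.
    polarity[i] is 1 if x_{i+1} is used in true form, 0 if complemented.
    The result a[mask] is 1 when the product of the literals selected by
    the set bits of mask appears in the expansion (a[0] is the constant 1).
    """
    n = len(polarity)
    if len(truth_table) != 1 << n:
        raise ValueError("truth table must have 2**n entries")
    # Flip every complemented variable so that each literal acts as a plain variable.
    flip = sum(1 << i for i, p in enumerate(polarity) if p == 0)
    a = [truth_table[k ^ flip] & 1 for k in range(1 << n)]
    # In-place Moebius transform over GF(2), one variable at a time.
    for i in range(n):
        bit = 1 << i
        for mask in range(1 << n):
            if mask & bit:
                a[mask] ^= a[mask ^ bit]
    return a


# f = x1 AND x2, with x1 complemented and x2 true:
# x1*x2 = (1 xor ~x1)*x2 = x2 xor ~x1*x2
print(fprm_coefficients([0, 0, 0, 1], [0, 1]))
```

Counting the set bits of each non-zero `mask` in the result gives the literal count of that polarity, which is how different polarities can be compared.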

C. Mixed Polarity Reed-Muller Expansions
If each variable in eq. (4) is allowed to appear in true form, in complemented form, or in both forms at the same time, the representation is known as the Mixed Polarity Reed-Muller (MPRM) expansion. $3^n$ different polarities, or expansions, are possible in MPRM expansion. The $3^n$ polarities of MPRM expansion include the $2^n$ polarities of FPRM expansion. Hence, the probability of obtaining a better solution is higher with MPRM than with FPRM. Example 2 illustrates the formation of an MPRM expansion.
Example 2: The function considered for FPRM expansion in Example 1 is used to illustrate the MPRM expansion. (8) A ternary variable is used to represent the polarities of the MPRM expansion. (9) If function $f_1(x_3, x_2, x_1)$ is encoded as $(201)_3$, where $x_1$ is expressed in true polarity, $x_2$ in complementary form, and $x_3$ in mixed form, the MPRM expansion of function $f_1$ for the given polarity is expressed as: (10) It is inferred from eq. (7) and (10) that a judicious choice of input variable polarity in MPRM expansion can provide a better solution than FPRM expansion. Nine (9) literals are required to represent the function given in Example 1 using FPRM, whereas only seven (7) literals are sufficient to represent the same function using MPRM expansion. The switching activity is also expected to decrease as the number of literals is reduced. The next section describes the tabular technique implementation for the MPRM thermal-aware problem realization.
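The size of the polarity search space can be checked with a short sketch that enumerates every ternary polarity vector and counts how many of them are fixed polarities. The helper names are hypothetical; the digit meaning (0 complemented, 1 true, 2 mixed) follows the encoding used in Example 2.

```python
from itertools import product


def mprm_polarities(n):
    """Return an iterator over every ternary polarity vector of length n."""
    return product((0, 1, 2), repeat=n)


def is_fixed_polarity(vector):
    """A polarity vector is an FPRM polarity when no variable is mixed (2)."""
    return 2 not in vector


n = 3
all_polarities = list(mprm_polarities(n))
fixed = [v for v in all_polarities if is_fixed_polarity(v)]
print(len(all_polarities), len(fixed))  # 27 and 8 for n = 3
```

The exponential growth of $3^n$ is what makes exhaustive enumeration impractical beyond small circuits.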
III. Proposed Thermal Aware Mixed Polarity Reed-Muller Approach Using Tabular Technique

A. Area Computation
In this work, the Boolean function is converted into MPRM form through efficient input variable encoding, so that the sharing of product terms is maximized with respect to the optimization parameters. A multi-input, multi-output Boolean function in the form of a .pla file is taken as input to the proposed synthesis process. The following steps illustrate the tabular technique implementation for the MPRM thermal-aware problem realization. A brief description of the polarity conversion procedure is given below.
All the terms present in the Boolean function are listed in binary form. Don't care conditions are realized in true as well as complementary form to generate a canonical representation. Input variables are encoded in mixed polarity, as shown by eq. (9). Then, the input functions are decomposed based on the encoding.
Inter-polarity conversion takes place according to the chromosome encoding. One of the following three cases can occur:
• 2 to 2 conversion: When a variable is initially in mixed form ('2') and its polarity after conversion is also '2', the variable remains in mixed form. In this case, the bits of the corresponding variable remain unchanged.
• 2 to 1 conversion: When the variable is initially in mixed form ('2') and its final polarity is '1', i.e., the variable exists in true form in the final expression, all the '0's of the variable are replaced by '1', and thus a new term with a don't care is generated in the table.
• 2 to 0 conversion: When the variable is initially in mixed form ('2') and its final polarity is '0', i.e., the variable exists in complementary form, all the '1's of the variable are replaced by '0', and thus a new term with a don't care is generated in the table.
After generating all the possible new terms for a single variable, they are compared with the existing terms to cancel out identical terms, and the table is updated. If two input cubes have the same output and their inputs differ in only one literal, that literal is replaced by the don't care symbol (-). These steps are repeated for all the input variables of the function to obtain the reduced MPRM expression. An arbitrary Boolean function is considered as an example and is shown in Fig. 1(a). The chromosome encoding for the example is shown in Fig. 1(b). The translation of the input Boolean function and the area computation using the tabular technique are shown below.
The two output functions are: (11) (12) Generally, Boolean functions are expressed in terms of AND-OR functions. As $f_1$ and $f_2$ are represented in disjoint cube form, they can be represented as: (13) and (14) Table I shows the input polarity conversion based on the encoding shown in Fig. 1(b). Variables x and y are represented in mixed polarity form, so no new term is generated for them. New terms are generated for z and w, which are expressed in true and complementary form, respectively. The redundant terms marked a, b, c, d, and e are eliminated, and the terms marked f form a new term by replacing one literal with a don't care.
After polarity conversion, the final MPRM outputs for functions $f_1$ and $f_2$ are represented as: Shared terms are: It is observed that the primary function requires 8 product terms with 32 literals, whereas the final function requires 7 product terms with 18 literals (where 3 product terms are shared between functions $f_1$ and $f_2$).
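The two table-update steps of the tabular technique, cancelling identical terms and merging cubes that differ in one literal, can be sketched as below. The sketch assumes cubes are strings over '0', '1' and '-', and that identical terms cancel in pairs because the terms are combined by XOR; both function names are hypothetical and the full polarity conversion is not reproduced here.

```python
from collections import Counter


def cancel_duplicates(cubes):
    """Keep only cubes that occur an odd number of times (t xor t = 0)."""
    counts = Counter(cubes)
    return [cube for cube in counts if counts[cube] % 2 == 1]


def merge_cubes(cube_a, cube_b):
    """Merge two cubes that differ in exactly one position holding 0 vs 1.

    Returns the merged cube with '-' at the differing position, or None
    when the two cubes cannot be merged.
    """
    if len(cube_a) != len(cube_b):
        raise ValueError("cubes must have the same length")
    diff = [i for i, (a, b) in enumerate(zip(cube_a, cube_b)) if a != b]
    if len(diff) != 1:
        return None
    i = diff[0]
    if {cube_a[i], cube_b[i]} != {"0", "1"}:
        return None
    return cube_a[:i] + "-" + cube_a[i + 1:]


print(cancel_duplicates(["1101", "0110", "1101"]))
print(merge_cubes("1101", "1100"))
```

The number of cubes that remain after these updates is the product-term count used as the area estimate.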

B. Power Estimation Using Switching Activity
In CMOS circuits, dynamic dissipation, which is caused by charging and discharging the load capacitances, is the main contributor to power consumption. It can be modeled as: (17) where $P_{dyn}$ and $P_{swt}$ represent the dynamic and switching power, respectively; $\alpha_L$ and $\alpha_i$ are the switching activities at the load and internal nodes, respectively; $C_L$ and $C_i$ are the capacitances at the load and internal gates, respectively; and $V_{DD}$, $V_T$, and $f$ are the supply voltage, threshold voltage, and frequency of operation, respectively.
Eq. (17) illustrates that, except for the switching activity, all parameters are defined by the user or manufacturer for a particular technology. Switching activity is therefore the only parameter that needs to be estimated for technology-independent power optimization. Switching activity is defined as the expected number of signal transitions at the outputs of the gates of a combinational logic circuit. This work follows the procedure used in [32] to estimate switching activity. Let us assume that the initial inputs are uncorrelated and statistically independent of each other, represented as: (18) The probability of the output of a gate when its inputs change from the previous state is estimated by: (19) The switching probability follows a stationary random process, so the probabilistic description does not change over a given period. Then, the switching activity of a logic gate ($\alpha_g$) is given by: (20) The generalized expression for the switching activity of an 'i'-input AND gate ($\alpha_{AND}$) with input switching probability 0.5 is given by: (21) The second-level ON-probability of the XOR gates may be computed as $P \cdot 0.5^i$, where 'i' is the number of inputs of the realization of a function with 'P' ON-terms. The probable switching activity of the node is given by: (22) The power consumption of an MPRM circuit is the sum of the power of the AND gates and the XOR gates. Assuming that 'n' is the set of nodes in the MPRM circuit, the total switching activity is given by: (23)
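A probabilistic switching activity estimate of this kind can be sketched as follows. The sketch assumes the commonly used temporal-independence model, in which a node with ON-probability $p$ switches with probability $2p(1-p)$, and an 'i'-input AND gate with independent inputs has ON-probability $0.5^i$. This model is an assumption and may differ in detail from the expressions of [32] used in the paper; all function names are hypothetical.

```python
def switching_activity(p_on):
    """Transition probability of a node, assuming the 2*p*(1-p) model."""
    if not 0.0 <= p_on <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return 2.0 * p_on * (1.0 - p_on)


def and_gate_activity(num_inputs, p_input=0.5):
    """Activity of an AND gate whose independent inputs are 1 with p_input."""
    return switching_activity(p_input ** num_inputs)


def total_switching_activity(and_fanins, xor_on_probabilities):
    """Sum the activity of all AND gates and all XOR nodes of a network."""
    and_part = sum(and_gate_activity(i) for i in and_fanins)
    xor_part = sum(switching_activity(p) for p in xor_on_probabilities)
    return and_part + xor_part


# Hypothetical network: AND gates with 1 and 2 inputs, one XOR node at p = 0.5.
print(total_switching_activity([1, 2], [0.5]))
```

Under this model, wider AND gates switch less often, which is why reducing the literal count does not by itself guarantee the lowest power.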

C. Power Density
The power density of a circuit is defined as the amount of power drawn per unit area. It is calculated as the ratio of the total switching activity to the area of the circuit:

$$Pd_{MPRM} = \frac{\alpha_{total}}{A_{MPRM}} \tag{24}$$

where $Pd_{MPRM}$, $\alpha_{total}$, and $A_{MPRM}$ represent the power density, the overall switching activity, and the total area of an MPRM realized network, respectively. The power density is estimated for each offspring chromosome to determine its thermal effect. The lower the power density, the better the distribution of temperature among the different modules within a chip. This has also been verified by finding the absolute temperature (in °C) using the Cadence and HotSpot tools.
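The three logic-level cost metrics of a candidate polarity can be gathered into one objective vector, as in the minimal sketch below. It assumes that the area is represented by the number of product terms and the power by the total switching activity, as described in this section; the function names are hypothetical.

```python
def power_density(total_switching_activity, area):
    """Ratio of total switching activity to area (logic-level estimate)."""
    if area <= 0:
        raise ValueError("area must be positive")
    return total_switching_activity / area


def objective_vector(num_product_terms, total_switching_activity):
    """Return (area, power, power density) estimates, all to be minimized."""
    density = power_density(total_switching_activity, num_product_terms)
    return (num_product_terms, total_switching_activity, density)


# Hypothetical candidate: 7 product terms with a total activity of 3.5.
print(objective_vector(7, 3.5))
```

Because the power density is a ratio, it can move against the area: a polarity with fewer product terms but the same activity has a higher density, which is why the three metrics are treated as separate objectives.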
IV. Non-Dominated Sorting Based Genetic Algorithm-II For Proposed Thermal-aware Realization
Classical search techniques such as the genetic algorithm (GA) disperse the optimum solutions throughout the search space and, when there are multiple objectives, can find only one optimal solution for a given weight combination in a single run. All possible weight combinations must therefore be explored to obtain the optimum solution, which makes the execution time very long. An elitist, non-dominated sorting based multi-criteria decision-making algorithm called the non-dominated sorting genetic algorithm-II (NSGA-II) is employed to overcome this limitation. NSGA-II is a fast and improved multi-objective evolutionary algorithm (MOEA) with computational complexity $O(XY^2)$, where 'X' is the number of objective parameters and 'Y' is the population size. In NSGA-II, fitness estimation and sharing parameters are replaced by rank assignment and front selection using non-dominated sorting and crowding distance calculation, which gives better elitism and faster convergence toward an optimum solution. The detailed procedure of NSGA-II is discussed in [33]. The configurable parameters, optimization objectives, and constraints used for the proposed algorithm are discussed in detail in this section.

A. Chromosome Structure
For an 'm'-input combinational logic circuit, the chromosome is efficiently encoded as a ternary string of length 'm'. The 'm' input variables $(l_1, l_2, l_3, \ldots, l_m)$ are assigned complementary, true, or mixed polarity by the ternary digits {0, 1, 2}. If the $p$-th digit is '1', the $p$-th input variable is implemented in true polarity, whereas if the $q$-th and $r$-th digits are '0' and '2', the $q$-th and $r$-th inputs are realized in complementary and mixed polarity, respectively.
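The ternary chromosome can be represented directly as a list of digits, as the following minimal sketch shows. The helper names are hypothetical, and treating the first gene as the most significant base-3 digit is an assumption made only to give each chromosome a compact index.

```python
import random

POLARITY_NAMES = {0: "complemented", 1: "true", 2: "mixed"}


def random_chromosome(num_inputs, rng):
    """Draw one ternary polarity string of length num_inputs."""
    return [rng.randrange(3) for _ in range(num_inputs)]


def describe(chromosome):
    """Translate each gene into the polarity it stands for."""
    return [POLARITY_NAMES[gene] for gene in chromosome]


def to_index(chromosome):
    """Read the chromosome as a base-3 number (first gene most significant)."""
    value = 0
    for gene in chromosome:
        value = value * 3 + gene
    return value


print(describe([2, 0, 1]), to_index([2, 0, 1]))
print(random_chromosome(5, random.Random(0)))
```

The index is convenient for detecting duplicate chromosomes with a set.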

B. Front Selection and Rank Assignment Based on Nondomination
Chromosomes in each front are assigned a fitness based on their rank, i.e., the front in which they lie. Chromosomes in the first front are assigned the best rank, one (1), individuals in the second front are assigned rank two (2), and so on.
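Front selection by non-domination can be sketched as below, assuming all objectives are minimized. For brevity the sketch peels off one front at a time and does not use the bookkeeping that gives NSGA-II its stated complexity; the function names are hypothetical.

```python
def dominates(a, b):
    """True when a is no worse than b in every objective and better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better


def non_dominated_fronts(objectives):
    """Split solution indices into fronts; the front at position k has rank k + 1."""
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objectives[j], objectives[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical two-objective values for five chromosomes.
points = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
print(non_dominated_fronts(points))  # [[0, 1, 2], [3], [4]]
```

Solutions in the first returned front are mutually non-dominated and form the current approximation of the Pareto set.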

1. Crowding distance calculation: Crowding distance ($i_{dist}$) is another fitness parameter, which depicts the density of solutions around a solution in the population. $i_{dist}$ is calculated by evaluating, for each objective function, the Euclidean distance between individual chromosomes in a front, considering 'n' objective functions in the 'n'-dimensional hyperspace.
2. Parent selection: A chromosome is selected as a parent if its rank is lower than that of the other candidate. If the ranks of the chromosomes are the same, the individual with the higher crowding distance is selected. The selected parent chromosomes generate the next-generation chromosomes using the crossover and mutation operators.
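The two steps above can be sketched together. The crowding distance below follows the standard NSGA-II formulation (normalized gap between the two neighbours in each objective, with boundary solutions set to infinity), which is an assumption about the exact distance measure used in the paper. The pairwise comparison is shown as a binary tournament, which is likewise an assumption; all names are hypothetical.

```python
import random


def crowding_distance(front):
    """Crowding distance of each objective vector within one front."""
    n = len(front)
    distance = [0.0] * n
    if n == 0:
        return distance
    for m in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][m])
        low, high = front[order[0]][m], front[order[-1]][m]
        # Boundary solutions get infinite distance so they are always preferred.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if high == low:
            continue
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            distance[order[k]] += gap / (high - low)
    return distance


def crowded_better(a, b, rank, distance):
    """True when a is preferred: lower rank first, then larger distance."""
    if rank[a] != rank[b]:
        return rank[a] < rank[b]
    return distance[a] > distance[b]


def select_parent(candidates, rank, distance, rng):
    """Binary tournament between two distinct randomly drawn candidates."""
    a, b = rng.sample(candidates, 2)
    return a if crowded_better(a, b, rank, distance) else b


print(crowding_distance([(1, 5), (2, 2), (5, 1)]))
```

Preferring the larger crowding distance among equally ranked chromosomes favours solutions in sparsely populated regions of the front and so preserves diversity.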

C. Genetic Operators
Crossover and mutation are the two inherent mechanisms of the NSGA-II algorithm. They introduce variation within the generated offspring and drive the output solution towards the optimal solution. It is observed from the literature that better offspring are generated by using 90% crossover and 10% mutation in NSGA-II based multi-objective evolutionary algorithms [8]. The same setting is followed in the proposed approach. Three other experiments (70% crossover and 30% mutation, 80% crossover and 20% mutation, and 100% crossover) were also carried out, but it was observed that 90% crossover and 10% mutation gives more diversity in the population and good results.
1. Crossover: During the crossover operation, two parent chromosomes 'x' and 'y' from the initial population mate to produce two new offspring, '$co_1$' and '$co_2$', at randomly selected crossover points. Two-point crossover converges faster towards the optimum solution than single-point crossover. The selection of parent chromosomes is biased towards the chromosomes with better fitness values (the 'elite group'). Chromosomes with rank one (1) are considered the elite group. Whether the parents for crossover are selected from the elite group or from the entire population depends on a uniform random number generated between '0' and '1'. If the number is greater than or equal to '0.5', the parents for crossover are chosen from the elite group; otherwise, they are selected from the entire population.
The threshold for the elite group is set to '0.5' so that best-fit chromosomes participate in the crossover operation and generate better offspring. Let the size of the population be 'p' and the cardinality of the elite group be 'q'. A chromosome of the elite group can be picked through either branch, so its selection probability is 0.5/q + 0.5/p, whereas a chromosome outside the elite group can be picked only from the entire population, with probability 0.5/p. The probability of selecting a chromosome from the elite group is therefore higher, because 'q' is much smaller than 'p'. This method selects best-fit chromosomes to participate in the crossover operation and generates better offspring than a truly random selection [34]. Two crossover positions ($cp_1$ and $cp_2$) are randomly selected within the chromosome string length, and the alleles are exchanged between the two selected individuals as shown in Fig. 2(a) and 2(b). Fig. 2(a) and (b) show the different outcomes obtained from the same parent chromosomes using crossover operation method 1 and method 2, respectively. After each generation, the new chromosomes are checked against the already generated chromosomes, and duplicates are eliminated.

Fig. 2. Crossover of parent chromosomes (x) and (y) producing offspring ($co_1$): (a) method 1; (b) method 2.
2. Mutation: Mutation maintains genetic diversity from generation to generation and prevents all the solutions in the population from falling into a local optimum. Mutant chromosomes contribute 10% of the 'N' offspring population. The mutation operation is performed by selecting a few random bit positions, called mutation points (mp), and altering the polarity at each selected position by the roulette wheel selection methodology, as shown in Fig. 3(a). To increase the randomness, the mutation points are chosen randomly within the range 1 to 'n' (where 'n' is the length of the chromosome). As an example, suppose a chromosome (m) from the present generation participates in the mutation operation and three positions are randomly selected as mutation points ($mp_1$, $mp_2$, and $mp_3$). The inter-conversion of polarity, that is, among positive, negative, and mixed polarity, is done using the roulette wheel criterion, and the remaining bits are left unaltered. The newly generated offspring becomes a chromosome of the next generation. A random number ($R_n$) between '0' and '1' is generated for each mutation point; if $R_n$ is greater than or equal to '0.5', the wheel position moves clockwise, otherwise anti-clockwise. The polarity of the mutation point changes according to the resulting wheel position. Choosing a random number based on some prior information (such as range, mean, or variance) is a convex optimization problem (which is determined by the entropy of the objective function). With a finite range, the maximum entropy is given by the uniform probability distribution function; other distributions have less entropy than the uniform distribution over the same range.
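The two genetic operators can be sketched as follows. The crossover shown exchanges the segment between the two cut points, which is one common two-point variant and not necessarily either of the two methods of Fig. 2. For mutation, mapping a clockwise wheel move to the cyclic step 0 → 1 → 2 → 0 (and anti-clockwise to the reverse) is an assumption, since the wheel layout of Fig. 3(a) is not reproduced here. All function names are hypothetical.

```python
import random


def pick_parent(population, elite, rng, threshold=0.5):
    """Draw from the elite group when the random number reaches the threshold."""
    use_elite = bool(elite) and rng.random() >= threshold
    return rng.choice(elite if use_elite else population)


def two_point_crossover(x, y, rng):
    """Exchange the alleles between two random cut points cp1 < cp2."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("parents must have equal length of at least 2")
    cp1, cp2 = sorted(rng.sample(range(len(x) + 1), 2))
    co1 = x[:cp1] + y[cp1:cp2] + x[cp2:]
    co2 = y[:cp1] + x[cp1:cp2] + y[cp2:]
    return co1, co2


def mutate(chromosome, num_points, rng):
    """Rotate the polarity at num_points distinct random positions."""
    mutant = list(chromosome)
    for mp in rng.sample(range(len(mutant)), num_points):
        step = 1 if rng.random() >= 0.5 else -1  # clockwise or anti-clockwise
        mutant[mp] = (mutant[mp] + step) % 3
    return mutant


rng = random.Random(3)
x = [0, 1, 2, 0, 1, 2, 0]  # hypothetical parents
y = [2, 2, 1, 1, 0, 0, 1]
co1, co2 = two_point_crossover(x, y, rng)
m = mutate(x, 3, rng)
print(co1, co2, m)
```

Because each rotation moves a gene to a different ternary value, every selected mutation point is guaranteed to change, while the unselected genes are copied unaltered.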
The proposed NSGA-II algorithm generates 'N' offspring chromosomes from the selected parents by crossover and mutation. The 'N' generated offspring and the 'N' parents together form the '2N' population for the next generation.

V. Results and Related Discussions
The proposed thermal-aware mixed polarity AND-XOR realization of logic circuits has been implemented using NSGA-II in C++ on a Linux platform, on a Pentium IV machine with a 3 GHz clock frequency and 4 GB of RAM. The algorithm is applied to the MCNC and LGSynth93 benchmark suites [35] for experimentation. In the proposed work, an NSGA-II based optimization approach is used for simultaneous reduction of area, power, and temperature. We have targeted the logic level for optimization in order to reduce the cost of cooling through a heat sink. At the logic level, the absolute values of area, power, and temperature are unknown. Therefore, the number of product terms is reduced as a proxy for area, the switching activity is reduced to optimize power, and the power density is reduced in the cost metric to lower the temperature. NSGA-II provides the Pareto optimal solution set consisting of the area-best, power-best, and power-density-best solutions and the optimum solution considering all three parameters. To obtain the actual silicon area (in µm²), power dissipation (in nW), and absolute temperature (in °C), the Cadence (Genus and Innovus) and HotSpot tools are used. Cadence and HotSpot are Electronic Design Automation (EDA) software packages used for simulating digital and analog circuits. We have generated the graphical design specification information interchange (GDS-II) report after layout design for the best and optimum solutions obtained using the proposed algorithm for each benchmark circuit, but there is no hardware implementation (chip fabrication) of the circuits. The discussion of results is divided into two sections. The first concerns the area, power, and power density results obtained using the NSGA-II approach. The second briefly describes the physical design implementation at 45 nm technology using the Cadence Genus and Innovus implementation tools, followed by the absolute temperature estimation using the HotSpot tool.

A. Result Based on NSGA-II
The '.pla' based circuits of the MCNC and LGSynth93 benchmark suites are taken as the input circuits to be optimized in terms of area, power, and temperature. The circuits are decomposed into MPRM expansions based on the input variable polarity encoding explained under chromosome structure. NSGA-II is used to find an efficient chromosome polarity based on area, power, and power density. Twenty (20) benchmark circuits are tested. Table II gives the parameters and evolution operator settings for the proposed NSGA-II based approach. To verify the efficiency of the proposed approach, the proposed best and optimized results for MPRM circuits are compared with the previously published best and optimum results for FPRM [36], Shared Reed-Muller Decision Diagram (SRMDD) [22], MPRM [10], AND-Inverter Graphs (AIGs) [37], and GA based FPRM [38] decomposed circuits. A set of solutions (the Pareto optimal solutions) is obtained, comprising the individual best solutions ('Area Best', 'Power Best', and 'Power density Best') and the 'optimum solution'. An area comparison of the proposed approach with FPRM, SRMDD, MPRM, AIGs, and GA based FPRM is presented in Table III. For power, the proposed method is compared with the FPRM, AIGs, and GA based FPRM solutions in Table IV. For power density, the proposed solutions are compared with the SRMDD and AIGs based solutions in Table V. In Tables III, IV and V, the first column shows the name of the circuit on which the experiment is carried out. The second and third columns of Tables III, IV and V give the proposed best and optimum solutions for area, power, and power density, respectively. The "Save Best" and "Save Opt" columns in Tables III, IV and V show the percentage savings of the proposed approach with respect to the existing works reported in the literature. The average percentage saving is calculated and reported in the last row of Tables III, IV and V. The percentage savings given in the "Save Best" and "Save Opt" columns of Tables III, IV and V are calculated by Eq. (25) and (26). (25) The percentage savings for the proposed best solution and the proposed optimum solution are denoted by "Save Best" and "Save Opt" in Eq. (25) and (26), respectively. "Best solution", "Optimum solution", and "Existing solution" represent the proposed NSGA-II based best solution, the NSGA-II based optimum solution, and the existing works reported in the literature, respectively.
From Table III, it is observed that the chromosome with the proposed area-best solution of NSGA-II saves 28.61%, 22.85%, 29.45%, 30.45%, and 35.19% area compared to the FPRM, SRMDD, MPRM, AIGs, and GA based FPRM results, respectively. The proposed optimum solution shows an area saving of 18.42%, 10.92%, 15.94%, 6.86%, and 23.51% compared to the FPRM, SRMDD, MPRM, AIGs, and GA based FPRM results, respectively. When the power-based realization is compared in Table IV, the FPRM, AIGs, and GA based FPRM solutions consume 28.71%, 85.35%, and 50.27% more power than the proposed best power-based solutions, respectively. When the proposed optimum power-based solutions are compared, the FPRM, AIGs, and GA based FPRM solutions consume 11.72%, 81.19%, and 41.96% more power than the proposed solutions, respectively. In Table V, the best power density based solutions show a 23.68% and 69.83% reduction in power density compared to the SRMDD and AIGs based solutions. The optimum power density based solutions provide 10.90% and 70.02% better results than the SRMDD and AIGs based solutions, respectively. Fig. 4 shows the Pareto-optimal graph for the "table3" benchmark circuit. The solutions nearer to the origin form the Pareto-optimal front and are optimal with respect to area, power, and power density. The solutions nearer to each axis represent the best solutions. The last column of Table V reports the total CPU time (in CPU seconds) required to execute the algorithm on an identical platform.

B. Physical Design Implementation at 45 nm Technology
At the logic level, an evenly distributed (average) power density is considered. The NSGA-II algorithm is used to determine the optimum input variable polarity based on area, power, and power density. Initially, the dynamic power is estimated by calculating the switching activity, and the area is estimated by calculating the total number of product terms. The power density of a particular logic realization is taken as the ratio of power to area. The optimized realization is then synthesized using the Cadence Genus digital design platform. The synthesized netlist is taken through physical design at 45 nm technology using the Cadence Innovus platform. After physical design, Innovus generates the floorplan information (.flp file) and the power profile (.pptrace file). In the floorplan information, the synthesized logic is represented as modules with their height, width, and X and Y coordinates, which fix the position of each module within the chip. The power profile gives the power dissipation of each module. The floorplan information and the power profile are given as input to the HotSpot tool, which generates the temperature profile of each module in degrees Celsius (°C). Fig. 5 shows the schematic flow diagram of temperature generation using the HotSpot tool. As an example, the floorplan information and power profile of the "rd53" benchmark circuit are shown in Fig. 6 and Fig. 7, respectively. The corresponding floorplan is shown in Fig. 8. The temperature profile generated by the HotSpot tool from the floorplan information and power profile is shown in Fig. 9.
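The two input files described above can be illustrated with a small sketch that formats them as text. It assumes HotSpot's plain-text conventions, namely one floorplan line per module holding name, width, height, left x, and bottom y in metres separated by tabs, and a power trace whose first line lists the module names followed by one line of power values (in watts) per sample. These conventions should be verified against the HotSpot documentation; the function names and module data are hypothetical.

```python
def floorplan_text(modules):
    """Format modules given as (name, width_m, height_m, left_x_m, bottom_y_m)."""
    lines = [
        f"{name}\t{width:.6e}\t{height:.6e}\t{left_x:.6e}\t{bottom_y:.6e}"
        for name, width, height, left_x, bottom_y in modules
    ]
    return "\n".join(lines) + "\n"


def power_trace_text(names, samples):
    """Format a header of module names and one row of powers per sample."""
    header = "\t".join(names)
    rows = ["\t".join(f"{power:.6e}" for power in sample) for sample in samples]
    return "\n".join([header] + rows) + "\n"


# Two hypothetical modules placed side by side, with one power sample.
modules = [("blk0", 1e-4, 2e-4, 0.0, 0.0), ("blk1", 1e-4, 2e-4, 1e-4, 0.0)]
flp = floorplan_text(modules)
trace = power_trace_text(["blk0", "blk1"], [[1e-6, 2e-6]])
print(flp)
print(trace)
```

Keeping the module order identical in both files avoids mismatches between a module's geometry and its power value.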
Cadence (Genus and Innovus) and HotSpot are electronic design automation tools used for simulating digital and analog circuits. We have generated the GDS-II report for the best and optimum solutions of each benchmark circuit. The netlist, the Synopsys Design Constraints (SDC) library and the Library Exchange Format (LEF) files at 45 nm technology are provided as input to the Cadence tool. This process generates the floorplan information (.flp) and the power profile (.pptrace), which act as input to the HotSpot tool for calculating the absolute temperature profile. The thermal packaging parameters used in the HotSpot tool to generate the temperature profile are: ambient temperature (45.5 °C), chip thickness (0.15 mm), convection capacitance (140.4 J/K), convection resistance (5 K/W), heat sink side (60 mm), heat sink thickness (6.9 mm), spreader side (30 mm), spreader thickness (1 mm), and chip-to-spreader interface thickness (0.020 mm). The dynamic thermal management (DTM) approach of the HotSpot tool is applied to the proposed method. The HotSpot tool has a built-in thermal management technique in which a threshold temperature can be set to restructure the model and trim down the peak temperature. The default threshold is 82 °C, and we kept this value for our realization. If, for a particular placement of logic cells, the temperature rises beyond 82 °C (depending on the power value and location of each cell), the thermal model of the HotSpot tool dynamically changes the relative placement of the cells so that the temperature falls below 82 °C. This technique is called "dynamic thermal management" in the HotSpot tool. NSGA-II provides the Pareto optimal solution set, consisting of the best and optimum solutions based on area, power and power density. Only four solutions (best area, best power, best power density, and the optimum solution considering area, power and power density together) are processed further for physical design implementation using the Cadence tool. The NSGA-II optimized circuits are taken to the physical design synthesis level, and the area (µm²), power (nW) and temperature (°C) values are reported in Table VI.
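The selection of four solutions from the Pareto set can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Candidate` record and function names are hypothetical, all three objectives are minimized, and the "optimum" compromise is picked here as the front member with the smallest sum of min-max normalized objectives, a rule assumed for the example and not taken from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    polarity: str    # hypothetical encoding, one digit per input variable
    area: float      # logic-level area estimate (product-term count)
    power: float     # logic-level power estimate (switching activity)
    density: float   # power / area


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in every objective and better in one."""
    oa, ob = a[1:], b[1:]
    return all(x <= y for x, y in zip(oa, ob)) and any(
        x < y for x, y in zip(oa, ob)
    )


def pareto_front(pop: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in pop if not any(dominates(o, c) for o in pop)]


def pick_four(front: List[Candidate]) -> dict:
    """Best area, best power, best density and an assumed compromise pick."""
    def normalised_sum(c: Candidate) -> float:
        total = 0.0
        for k in range(1, 4):
            values = [f[k] for f in front]
            lo, hi = min(values), max(values)
            span = (hi - lo) or 1.0  # avoid division by zero on flat fronts
            total += (c[k] - lo) / span
        return total

    return {
        "best_area": min(front, key=lambda c: c.area),
        "best_power": min(front, key=lambda c: c.power),
        "best_density": min(front, key=lambda c: c.density),
        "optimum": min(front, key=normalised_sum),  # assumed rule
    }
```

The same candidate may fill more than one slot, for example when the lowest-power polarity also has the lowest power density, so fewer than four distinct circuits may reach physical design.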
The second, third and fourth columns, labelled 'Best_area', 'Best_power' and 'Best_peak_Temp', report the standard cell area, power consumption and peak temperature obtained from the best area, best power and best power density solutions of NSGA-II, respectively. The next three columns report the same for the optimal solution. A comparative analysis with SRMDD and the espresso-decomposed AND-INVERTER GRAPH (AIG) structure [22], [37] is reported in Table VI. The last column of Table III indicates the maximum CPU time (in seconds) needed to implement a benchmark circuit among all the cases (best solutions and optimal solution) on an identical platform. The average percentage savings reported in the last three rows of Table VI are calculated by Eq. (27), in which "Savings_Average" denotes the average percentage savings, and "Proposed_solution" and "Earlier_solution" denote the solution of the proposed approach and the solution reported in the earlier literature, respectively. Figs. 10, 11 and 12 show that the best solution and the optimum solution save 75.21% (76.20%) and 73.69% (74.82%) of the standard cell area compared with the SRMDD-based best solution (SRMDD-based optimum solution), respectively. The best peak temperature and the optimum peak temperature are reduced by 13.52% (17.06%) and 12.49% (16.08%) compared with the SRMDD-based best solution (SRMDD-based optimum solution), respectively. The best area, best power and best peak temperature solutions of the MPRM expansion save 8.80% area, 29.09% power and 3.89% peak temperature, respectively, when compared with the espresso-decomposed AIG structure-based solutions. When the optimal solution from the Pareto optimal solution set is compared, it shows average savings of 26.20% in power and 2.70% in peak temperature over the espresso-decomposed AIG structure solutions, at the cost of a 4.39% increase in area.

Fig. 10. Average percentage savings of the proposed approach w.r.t. SRMDD best solutions [22].
Fig. 11. Average percentage savings of the proposed approach w.r.t. SRMDD optimum solutions [22].
Fig. 12. Average percentage savings of the proposed approach w.r.t. AIGs circuit decomposition [37].
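As a rough illustration of how an average percentage saving of this kind can be computed, the Python sketch below assumes that the saving is evaluated per benchmark circuit as the relative reduction with respect to the earlier solution and then averaged over the circuits. The body of Eq. (27) is not reproduced in this text, so this form is an assumption, and the function name is hypothetical.

```python
from typing import Sequence


def average_percentage_savings(
    earlier: Sequence[float], proposed: Sequence[float]
) -> float:
    """Mean of per-benchmark savings, in percent (assumed form).

    A negative result means the proposed solutions are worse on average,
    i.e. an increase rather than a saving.
    """
    if not earlier or len(earlier) != len(proposed):
        raise ValueError("need two non-empty sequences of equal length")
    per_circuit = [
        (e - p) / e * 100.0 for e, p in zip(earlier, proposed)
    ]
    return sum(per_circuit) / len(per_circuit)
```

For example, earlier values of 100 and 200 against proposed values of 50 and 150 give per-circuit savings of 50% and 25%, so the function returns 37.5.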

VI. Conclusion and Future Works
This paper proposed an NSGA-II based input variable polarity selection of MPRM expansion for thermal-aware realization. Area, power, and temperature are considered simultaneously as objective parameters. The number of product terms is taken as the representative area, and the switching activity is taken as the power consumption at the logic level. Power per unit area (power density) is taken as the metric to estimate the effect of temperature. The input polarity of MPRM is chosen such that all the parameters are optimum. To find the non-dominated optimal solutions based on the input polarity of MPRM circuits, an NSGA-II based approach is applied and the Pareto optimal solution set is reported.
The proposed results are compared with FPRM, GA based FPRM, SRMDD, MPRM and AIG based solutions, and a significant reduction in area, power and power density is observed. Finally, the NSGA-II based solutions are implemented using the Cadence tool at 45 nm technology to obtain the on-chip silicon area and power consumption. The floorplan information and power profile are used to obtain the absolute temperature generated by a particular logic circuit, in degrees Celsius, using the HotSpot tool. A maximum saving of 76.20% in area, a 29.09% saving in power and a 17.06% reduction in peak temperature are observed using the proposed MPRM approach with respect to earlier reported works.
Future research aims to determine the correlation between ageing-aware and thermal-aware design, and to find an optimum solution for realizing a circuit using MPRM expansion.

Fig. 1. (a) '.pla' file representation of a Boolean function ('i', 'o' and 'p' represent the number of inputs, number of outputs and number of product terms, respectively); (b) Input variable encoding ('x' and 'y' are in mixed form; 'w' is in complementary form and 'z' is in true form).

Fig. 3(b) illustrates the operation of the roulette wheel criterion. A random number (R_n) between '0' and '1' is generated for each mutation point; if R_n is greater than or equal to '0.5', the wheel position moves clockwise, otherwise it moves anti-clockwise. The polarity of the mutation point changes according to the resulting wheel position. Choosing a random number based on some prior information (such as range, mean or variance) is a convex optimization problem, which is determined by the entropy of the objective function. For a finite range, the maximum entropy is given by the uniform probability distribution function. Any other distribution over the same range has lower entropy than the uniform distribution.
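The roulette wheel mutation rule can be sketched in Python as below. The sketch assumes that the three polarities of an input variable are coded as 0, 1 and 2 and sit on the wheel in that order, so that a clockwise move adds one and an anti-clockwise move subtracts one, modulo three. The coding, the wheel ordering and the function names are illustrative assumptions; only the 0.5 threshold on a uniformly drawn R_n comes from the text.

```python
import random
from typing import List, Sequence

NUM_POLARITIES = 3  # assumed codes 0, 1, 2 placed in this order on the wheel


def mutate_polarity(polarity: int, rng: random.Random) -> int:
    """Move one step on the wheel: clockwise if R_n >= 0.5, else the reverse."""
    r_n = rng.random()  # uniform draw from [0, 1)
    step = 1 if r_n >= 0.5 else -1
    return (polarity + step) % NUM_POLARITIES


def mutate(
    chromosome: Sequence[int], points: Sequence[int], rng: random.Random
) -> List[int]:
    """Return a copy of the chromosome with each mutation point moved."""
    child = list(chromosome)
    for i in points:
        child[i] = mutate_polarity(child[i], rng)
    return child
```

Because every move is exactly one step on a three-position wheel, a mutated gene always changes polarity, and the uniform draw makes the two alternative polarities equally likely.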

Fig. 5. Schematic Flow-diagram of temperature profile generation using the HotSpot tool.

Fig. 9. Temperature profile of the "rd53" benchmark circuit generated by the HotSpot tool.

Fig. 10, 11, and 12 show the average percentage improvement of the

TABLE I. Input Variable Polarity Transformation Using Tabular Technique

TABLE II. Parameters and Evolution Operator's Settings for the Proposed NSGA-II Approach

Table IV, it is observed that FPRM, AIGs, and GA based FPRM solutions consume

TABLE III. Area Comparative Study of the Proposed MPRM Realization