System-Level Leakage Power Estimation Model for ASIC Designs

With advances in CMOS technology and the move to sub-micron processes, leakage power dissipation is becoming a critical design metric. Designs are growing more complex as they incorporate more functions, which further increases leakage power dissipation. Meeting a low-power design objective therefore requires early exploration and estimation. This paper presents a power estimation model for ASIC (Application Specific Integrated Circuit) based designs at the C level of abstraction. The method analyzes the LLVM (Low-Level Virtual Machine) bit-code to extract application-specific information and then uses it to train a neural network. The trained model provides an estimate of the leakage power. The estimated leakage power of the designs is compared with the implemented power to demonstrate the accuracy. In addition, the model provides fast estimates and eliminates the need for synthesis-based exploration.


Introduction
Modern applications require large on-chip functions, which results in increased design complexity. The leakage power of such complex designs makes the power budget a critical issue. In ASIC design, the leakage power of cells such as AND, NAND, and MUX increases as the technology node shrinks. Complex functional designs require a large number of combinational and sequential cells. Sub-threshold leakage, gate-oxide tunneling leakage, and reverse-bias drain-substrate leakage are the primary sources of leakage power dissipation [1]. Leakage power is dissipated in active as well as idle time. It is therefore a primary concern for designers, particularly since changing a design at a later stage is difficult.
Leakage power can be determined very quickly at the system level, which reduces the design cycle time. Modeling designs at the system level provides an opportunity to explore the design metrics in the initial phase. HW and SW have been modeled at the cycle-accurate level [2], the instruction level [3], and the functional level [4] to obtain power estimates. However, there is still a need for a model that provides quick as well as accurate estimates of design power at a high level of abstraction. Therefore, this work presents a model that gives an accurate early power estimate at the system level. The LLVM IR and an ANN are used for the profiling and the modeling, respectively. The rest of the paper is organized as follows. Section 2 discusses related work. Section 3 discusses the profiling and modeling methodology. Section 4 presents the implementation and the results, and Section 5 presents the conclusion.

Related Work
Sources of leakage current and reduction techniques were discussed in [1]. These techniques are proposed at the circuit level and the process technology level. To decrease the design time, however, estimation should be done at a high level of abstraction.
The leakage power of ASIC designs was estimated in [5], where a linear regression model with a maximum error of 12 % was reported for HDL descriptions of the circuits.
In [6], an early power estimation model with better accuracy was presented for FPGA-based designs. However, it does not provide a model for ASIC-based designs.
In [2], power was evaluated by observing the cycle-by-cycle operations and exchanges. However, that work models the static power at the cost of accuracy.
A SystemC class library, PowerSim, was proposed in [7] to calculate the energy consumption of hardware described at the system level. It monitors the C++ operators called on SystemC data types during simulation. However, PowerSim requires a change to the SystemC library and recompilation of the source code.
The power consumption of the IO ports of the OpenRISC architecture was modeled in [3] using the types of instructions, such as load and store, and the cache miss rate. Since the effect of adjacent instructions was not considered, a maximum error of 15 % was reported.
Open-PEOPLE (Open Power and Energy Optimization and Estimator), presented in [8], integrates estimation tools ranging from the functional level to real board measurements; a maximum error of 22.5 % was reported.
A fast HLS-based system-level power estimation method was presented in [9], with a reported maximum error of 9 % for DSP-based designs. Similarly, neural-network-based linear and nonlinear system-level power estimation models were presented in [10], with reported maximum errors of 31 % and 4.78 % for the linear and nonlinear models, respectively.
A system-level power estimation model was proposed in [11] for wireless communication systems implemented on an FPGA. However, it takes a significant amount of time because of the low-level characterization.
In [4], learning methods based on high-level energy models were used for processors. A host-compiled modeling approach was used instead of instruction-level simulation. The observed maximum estimation error is 10 % for the considered applications GEMM, DCT, and HDR.
With high-level synthesis tools [12], [13], [14], it is now possible to start the design space exploration at the system level and synthesize the behavioral description into an RTL model. The RTL model of a given application, simulated with a test bench, provides a power estimate. PETS, a simulation-based tool, was presented in [15]. It estimates the power by running an application on the embedded platform. However, it does not generalize the power model and requires a new power model for each new architecture.
PK tool 2.0, a power estimation model for executable designs, was introduced in [16]. It directly extracts the energy consumption from the SystemC sc_module that contains the system under examination. Since the sc_module is divided into different power states representing the operating conditions and the set of instructions, the power-state information must be provided to the tool.
To summarize, the aforementioned approaches are time-consuming, inaccurate, or complex. Power estimation is difficult at the system level because little implementation information is available, which may lead to poor precision. Consequently, it prevents fast and efficient design space exploration.
Therefore, this paper presents a system-level power estimation model to overcome the aforementioned challenges. The profiling phase analyzes the LLVM IR (bit-code) format of the applications. The LLVM IR is in static single assignment (SSA) form [17] and is easier to analyze than the machine-level representation of an application. This makes it possible to collect the application-specific information from the LLVM IR through static code analysis.

Profiling and Modeling Methodology
LLVM bit-code files were generated by the Clang compiler. The bit-code analysis flow is shown in Fig. 2. The bit-stream files contain tags and nested structures. Blocks in a bit-stream denote nested regions of the stream and are identified by a content-specific ID number. The block IDs 0-7 are reserved, and the IDs greater than 7 provide the application-specific information. The information from these blocks that is used as the input to the proposed model is shown in Tab. 1.
After profiling of the applications, the training data set was prepared to train the ANN-based model [18]. A feed-forward back-propagation neural network is selected for the training. The input data set, together with the corresponding target data, is applied to the network for supervised learning. The errors are obtained by comparing the network output with the target data. The weight of each connection is updated according to these errors, and training continues with the updated weights. trainlm and learngdm are the selected training function and adaption learning function, respectively. Tansig and purelin are the activation functions of the input-layer and output-layer neurons, respectively. The network configuration and training parameters are shown in Tab. 2.
Tab. 2: Network configuration for leakage power estimation.
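The supervised training loop described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' setup: it uses plain batch gradient descent in place of trainlm (Levenberg-Marquardt), the inputs and targets are random placeholders rather than profiled applications, and the names `forward`, `train`, `lr`, and `epochs` are hypothetical.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """One tansig layer followed by a purelin (identity) output neuron."""
    A = np.tanh(X @ W1 + b1)          # tansig is mathematically tanh
    return A, A @ W2 + b2

def train(X, y, hidden=7, lr=0.05, epochs=2000, seed=0):
    """Batch gradient descent on the mean squared error (stand-in for trainlm)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        A, out = forward(X, W1, b1, W2, b2)
        err = out - y                             # network output vs. target
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / len(X)                # gradient at the output neuron
        g_hid = (g_out @ W2.T) * (1.0 - A ** 2)   # back-propagated through tanh
        W2 -= lr * (A.T @ g_out)                  # weights updated from the errors
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (W1, b1, W2, b2), losses

# Placeholder data: 74 samples with 7 features each and a synthetic target.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(74, 7))
y = 0.3 * X.sum(axis=1, keepdims=True)
params, losses = train(X, y)
print(f"MSE before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

In practice the inputs would be scaled features taken from the profiled bit-code, and the targets would be the leakage power reported by synthesis.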

Implementation and Result
In this section, estimation results are presented for ASIC-based designs. A total of 74 applications were used for the training. After that, the trained model was validated on different benchmark applications. The information obtained from the LLVM IR was applied to the proposed neural network model. The proposed neural network model is shown in Fig. 3.
The network has 7 inputs and an input layer of 7 neurons. The transfer function of the input-layer neurons is tansig, and that of the second layer, which is the output layer, is purelin. The following equations represent the output of this neural network configuration.
$$A_a = \mathrm{tansig}\left(\sum_{n=1}^{7} I_n\, w_{na} + b_{Aa}\right), \quad a = 1, \dots, 7 \tag{1}$$
The output of this model is given by Eq. (2).
$$O = \mathrm{purelin}\left(\sum_{a=1}^{7} A_a\, w_{AaO} + b_{OA}\right) \tag{2}$$
In Eq. (1) and Eq. (2), $I_n$ are the inputs, $w_{na}$ are the weights from the inputs to the hidden layer neurons, $b_{Aa}$ are the biases to the hidden layer neurons, $A_a$ are the outputs of the hidden layer neurons, $w_{AaO}$ are the weights of the hidden layer neurons to the output layer neuron, $b_{OA}$ is the bias to the output neuron, and $O$ is the output of the network, i.e. the estimated leakage power. The proposed ANN-based model is applied to the CHStone [19] benchmark applications listed in Tab. 3.
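As a sketch of the forward computation for the configuration described above (seven inputs, seven tansig neurons, one purelin output neuron), the following NumPy function evaluates the network for one input vector. The function name `estimate` and all weight values are placeholders chosen for illustration; trained weights are not given in the text.

```python
import numpy as np

def estimate(inputs, w_in, b_hidden, w_out, b_out):
    """Evaluate the 7-7-1 network for one feature vector.

    inputs:   shape (7,)   features from the profiled bit-code
    w_in:     shape (7, 7) weight from input n (row) to hidden neuron a (column)
    b_hidden: shape (7,)   hidden-neuron biases
    w_out:    shape (7,)   weights from hidden neurons to the output neuron
    b_out:    scalar       output-neuron bias
    """
    hidden = np.tanh(inputs @ w_in + b_hidden)   # tansig
    return float(hidden @ w_out + b_out)         # purelin is the identity

# Placeholder weights (not trained values).
w_in = np.ones((7, 7))
b_hidden = np.zeros(7)
w_out = np.full(7, 0.5)
b_out = 1.0

# With zero inputs and zero hidden biases, tanh(0) = 0, so the output equals b_out.
print(estimate(np.zeros(7), w_in, b_hidden, w_out, b_out))  # 1.0
```

Because tansig saturates at ±1, the output of this configuration is bounded by the output bias plus or minus the sum of the absolute output weights, which is one reason to scale the inputs before training.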
The size of the blocks is shown in Tab. 4 for the benchmark applications [19].
Tab. 4: Type and size of LLVM IR Blocks.
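The profiling step keeps only the blocks whose ID is greater than 7, since IDs 0-7 are reserved by the bit-stream format. A minimal sketch of turning per-block sizes into a model input vector is given below. The helper names, the choice of block IDs, and the sizes are all hypothetical placeholders; the inputs actually used by the model are those listed in Tab. 1.

```python
FIRST_APPLICATION_BLOCK_ID = 8  # IDs 0-7 are reserved by the bit-stream format

def application_block_sizes(block_sizes):
    """Drop reserved blocks and return the rest ordered by block ID."""
    return {
        block_id: size
        for block_id, size in sorted(block_sizes.items())
        if block_id >= FIRST_APPLICATION_BLOCK_ID
    }

def feature_vector(block_sizes, selected_ids):
    """Build a fixed-order input vector; blocks absent from the file count as 0."""
    app_blocks = application_block_sizes(block_sizes)
    return [float(app_blocks.get(block_id, 0)) for block_id in selected_ids]

# Placeholder sizes keyed by block ID (illustrative values only).
sizes = {0: 64, 8: 1200, 11: 320, 12: 5400, 14: 210}
print(application_block_sizes(sizes))
print(feature_vector(sizes, [12, 11, 17]))
```

Fixing the order of `selected_ids` keeps each position of the vector tied to the same block type for every application, which a trained network requires.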

Validation and Comparison
In this section, the estimated power is validated against the power obtained from a commercial tool in order to demonstrate the accuracy of the proposed methodology. Figure 4 shows the result validation and comparison flow. The output of the proposed model, the estimated leakage power, is compared with the power obtained from the Synopsys Design Compiler. Designs are compiled with Synopsys Design Compiler using the SAED_90nm library. During the training phase, the error in the evaluated power is used to adjust the network weights. The trained network can then estimate the power of a given application. The estimated results are compared using the following equation.
$$\mathrm{error}_i = \frac{\left| e_i - p_i \right|}{p_i} \times 100\,\%$$
In the above expression, $\mathrm{error}_i$ is the percentage error in the estimate, $e_i$ is the estimated power obtained from the presented methodology, and $p_i$ is the power obtained from Synopsys DC, for an application $i$. The estimated results and comparisons are shown in Tab. 5 and Tab. 6, respectively. The time comparison is shown in Tab. 7. The platform used for the time comparison is an Intel Core i3 @ 3.3 GHz processor. Since the LLVM profiling output is available in less than 1 second, only the training time of the ANN was considered. The total time, $t_2$, taken by Synopsys DC for the benchmark applications in this paper is 4 minutes and 53 seconds. With the proposed model, estimation (time $t_1$) is 58 times faster than with the synthesis tool.
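The two comparison metrics can be worked through with a short calculation. The sketch below assumes the percentage error is taken as an absolute value relative to the Synopsys DC figure, and the power values passed to it are placeholders, not results from Tab. 5. Only the 4 min 53 s synthesis time and the factor of 58 come from the text; the model time is derived from them.

```python
def percentage_error(estimated, reference):
    """Absolute percentage error of an estimate against the reference power."""
    return abs(estimated - reference) * 100.0 / reference

def speedup(t_reference_s, t_model_s):
    """How many times faster the model is than the reference flow."""
    return t_reference_s / t_model_s

# Placeholder power values in arbitrary units (illustrative only).
print(percentage_error(97.0, 100.0))   # 3.0

t2 = 4 * 60 + 53   # Synopsys DC total time from the text: 293 s
t1 = t2 / 58       # model time implied by the reported 58x factor, about 5.05 s
print(f"t1 = {t1:.2f} s, speedup = {speedup(t2, t1):.0f}x")
```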

Conclusion
A system-level power estimation model was presented for ASIC designs. The application-specific information was obtained through analysis of the LLVM bit-code and then employed to train the ANN. Estimation errors of 2.8 % to 5.1 % were observed. Moreover, the model was found to be 58 times faster than the Synopsys Design Compiler. In addition, this work can be extended to provide an early estimate of the area of ASIC designs.

Tab. 3: Benchmark/Application description.
Benchmarks | Description
gsm | Linear predictive coding analysis of global system for mobile communications
dfsin | It implements IEC/IEEE standard double precision floating point sine function using 64-bit integer numbers
dfdiv | It implements IEC/IEEE standard double precision floating point division using 64-bit integer numbers

© 2018 ADVANCES IN ELECTRICAL AND ELECTRONIC ENGINEERING
Tab. 5: Comparison of estimated leakage power with implemented power.
Tab. 7: Time taken by Synopsys Design Compiler (DC) and the proposed model to get the power estimate on the Intel Core i3 @ 3.3 GHz processor platform.