A Neural Network trained to perform as an Imprecise 4:2 Compressor using Machine Learning Algorithm

In many fields, such as speech recognition, image processing, Internet of Things (IoT) systems, arithmetic circuits, etc., artificial neural networks have become a popular solution for a wide range of problems. In this research work, a 4:2 compressor-based arithmetic circuit is considered and an approximation technique called probabilistic pruning is applied to it. The proposed compressor is compared with various existing approximate 4:2 compressors. A neural network is then trained to behave as the proposed inexact 4:2 compressor using a supervised machine learning algorithm. The accuracy values of the proposed compressor are acceptable at both the architectural and neural network levels. The absolute difference between the train set and predicted test set accuracy values is very small (1.73%) compared with the deep neural network models trained for the other inexact 4:2 compressors.

Another work presented a study on combining three approximation techniques on more than one layer of the system stack to obtain greater benefits, and noted that those benefits can be applied to different domains.
The authors of [6] presented how they developed circuits, architectures, and software spanning the stack by applying approximate computing techniques. In [7][8], 4:2 compressors were proposed and placed in a Dadda multiplier to obtain low-power multipliers; these multipliers were then used in image processing applications.
Due to approximations in digital circuits, the difference between the outputs of exact and inexact circuits can be calculated from the equations proposed in [9]. The work of [10] resembles that of [7][8], but the authors used approximate multipliers of different operand lengths and demonstrated an image filtering application. ML techniques applied to CAD algorithms were presented in [11] to automate the operation of the tools. In [12], the authors proposed an approximate compressor using inexact logic minimization and embedded it in a multiplier for power efficiency; finally, an algorithm was given to simulate them in Google Colaboratory.

Proposed method
This section presents a proposed inexact 4:2 compressor employing a probabilistic pruning type of approximation. A feed-forward neural network is trained to behave as the proposed 4:2 compressor.

Architectural approach of proposed 4:2 compressor
The probabilistic pruning approach is applied to the exact 4:2 compressor by removing a multiplexer present at the output of Sum [1]. The Sum output of the proposed 4:2 compressor is therefore taken from the multiplexer whose select line is connected to the output of the second XOR gate [1]. With this approximation, the Cout and Carry outputs are unchanged. The proposed approximation yields the fewest errors with a reduced transistor count. The Sum expression of the proposed 4:2 compressor is given in equation (1).
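Since the pruning starts from the exact 4:2 compressor, its behavior can be sketched for reference using the standard Boolean equations and checked against the arithmetic identity Sum + 2·(Carry + Cout) = X1 + X2 + X3 + X4 + Cin. This is a minimal sketch assuming the conventional XOR/multiplexer realization; the helper name is hypothetical and this is not the paper's transistor-level netlist.

```python
from itertools import product

def exact_compressor(x1, x2, x3, x4, cin):
    """Standard (exact) 4:2 compressor: Sum has weight 1,
    Carry and Cout each have weight 2."""
    p = x1 ^ x2 ^ x3 ^ x4                 # parity of the four primary inputs
    s = p ^ cin                           # Sum output
    cout = (x1 & x2) | (x3 & (x1 ^ x2))  # generated from X1..X3 only
    carry = (cin & p) | (x4 & (p ^ 1))   # multiplexer selected by the parity
    return cout, carry, s

# Sanity check: the outputs must reproduce the arithmetic sum of the
# five inputs for all 2**5 = 32 input combinations.
for x1, x2, x3, x4, cin in product((0, 1), repeat=5):
    cout, carry, s = exact_compressor(x1, x2, x3, x4, cin)
    assert s + 2 * (carry + cout) == x1 + x2 + x3 + x4 + cin
```

The identity check confirms the truth table is self-consistent; the proposed design replaces only the Sum logic while keeping Cout and Carry as above.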
The transistor-level circuit of the proposed 4:2 compressor is shown in Figure 1. The circuit produces the minimum number of errors: Cout and Carry have 0 errors, while the Sum output has 8 errors, giving a 25% error rate considering all three outputs. The approximate design of [12] also reports a 25% error rate, but that figure considers only the Sum output; its Cout and Carry outputs produce 8 and 4 errors respectively when compared with the exact 4:2 compressor. Thus, the proposed inexact circuit has '0, 0, 8' errors while [12] has '8, 4, 8' errors for the Cout, Carry, and Sum outputs respectively. With this lowest error rate, the inexact design was embedded in a high-speed parallel Dadda multiplier to reduce the power consumption of the multiplier and, in turn, of the Arithmetic Logic Unit (ALU) and processor.
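The error-rate figure above can be made concrete with a short sketch: with 5 binary inputs there are 2**5 = 32 input combinations, and the rate is the fraction of combinations whose output differs from the exact compressor. The helper below is hypothetical (not the paper's MATLAB analysis); the mismatch counts are those stated in the text.

```python
# With 5 binary inputs there are 2**5 = 32 rows in the truth table.
TOTAL_ROWS = 2 ** 5

def error_rate(mismatched_rows, total_rows=TOTAL_ROWS):
    """Error rate in percent: rows that differ from the exact circuit."""
    return 100.0 * mismatched_rows / total_rows

# Proposed design: Cout and Carry are exact, Sum differs on 8 of 32 rows.
assert error_rate(8) == 25.0
```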

Neural Network model of an Inexact 4:2 Compressor
A supervised machine learning algorithm is used for training the neural network. The truth table of the proposed inexact compressor is given as the dataset to the NN, with Cin, X4, X3, X2, X1 as inputs and Cout, Carry, and Sum as target labels. The architecture of the NN with inputs and outputs is shown in Figure 2. The truth table is divided into train and test sets to find the accuracy of the NN model. The accuracy values obtained from the train and test sets describe how well the NN model performs as the proposed inexact compressor. Ideally, the train accuracy and the predicted test accuracy should be equal, i.e., their difference should be zero; when these accuracy values are equal, the model is said to be completely validated.
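The shape of such a network can be sketched in plain NumPy: 5 binary inputs, one hidden layer, and 3 sigmoid outputs, one per target label. The hidden-layer width and activations here are assumptions for illustration (the paper's actual hyperparameters are those listed in Table 3), and only the forward pass is shown.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# The 32-row input side of the truth table: Cin, X4, X3, X2, X1.
X = np.array(list(product((0, 1), repeat=5)), dtype=float)  # shape (32, 5)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 units (assumed width), 3 outputs: Cout, Carry, Sum.
W1, b1 = rng.normal(size=(5, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def forward(x):
    h = sigmoid(x @ W1 + b1)
    return sigmoid(h @ W2 + b2)   # per-output probabilities in (0, 1)

y_hat = forward(X)
assert y_hat.shape == (32, 3)
assert ((y_hat > 0) & (y_hat < 1)).all()
```

Training such a network with binary cross-entropy on each of the three outputs, as in a typical Keras setup, would drive these probabilities toward the compressor's truth-table values.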

Results and Discussion
The experimental work on the 4:2 compressors was performed using the Spectre simulator of Cadence Virtuoso for schematic simulations. To find the area of these architectures, Cadence Layout XL was used in the 45nm CMOS technology node. The error analysis of the compressors was performed using MATLAB. The training of the NN was done in a Jupyter notebook with the Pandas, NumPy, and Keras libraries.

Simulation Results of 4:2 compressors
In this subsection, the proposed inexact 4:2 compressor is compared with the exact and existing approximate 4:2 compressors in terms of transistor count, power, energy, and area. The simulation results of the compressors are tabulated in Table 1. When the proposed compressor is compared with design1 of [7], the improvements in power, energy, and area are 18.72%, 63.05%, and 42.36% respectively. From Table 1, it is observed that the power, area, and energy vary according to the transistor count of each 4:2 compressor; the graph shown in Figure 3 also illustrates this trend. The error analysis results are summarized in Table 2. The proposed inexact 4:2 compressor and [12] have the same error rate, but in [12] only the Sum output is considered, whereas in the proposed circuit the 25% rate accounts for all the outputs (0 + 0 + 8 errors for Cout, Carry, and Sum). Although its transistor count is very low, design1 of [7] gives a 37.5% error rate, which is higher than the rest of the circuits. The design of [8] has only four inputs and two outputs (Carry and Sum); hence the error analysis was performed over 16 input combinations instead of 32, and its error rate is 31.25%.
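The stated percentages can be cross-checked arithmetically, treating the error rate as mismatched rows over total input rows. The mismatch counts for [7] and [8] below are inferred from the rates themselves (the paper gives only the percentages), so they are illustrative rather than quoted values.

```python
# Cross-check of the stated error rates:
# rate = 100 * mismatched rows / total input rows.
def rate(mismatched, total):
    return 100.0 * mismatched / total

assert rate(8, 32) == 25.0    # proposed: Sum differs on 8 of 32 rows
assert rate(12, 32) == 37.5   # design1 of [7] (count inferred from the rate)
assert rate(5, 16) == 31.25   # design of [8]: 4 inputs -> 16 rows
```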

Simulations of a deep neural network model
Training of the NN has been done by splitting 66% of the dataset into a train set and 34% into a test set. The parameters used for building the NN are given in Table 3. The absolute difference between the accuracies of the train and test sets is 1.73%. This small difference suggests that the NN model is well built as the proposed inexact 4:2 compressor. The accuracy and loss graphs over the number of epochs are shown in Figure 4. To compare train and test accuracies of the other approximate designs, NN models were built using their datasets, and the accuracies obtained are shown in Table 4. From Table 4, it is clear that the proposed inexact 4:2 compressor is modeled best, since the absolute differences between train and test set accuracies are higher for the other models than for the proposed one.
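The 66/34 split described above can be sketched as follows; the shuffling seed is an assumption, and with 32 truth-table rows the split yields 21 training rows and 11 test rows.

```python
import numpy as np
from itertools import product

# The full truth-table input side: 32 rows of 5 binary inputs.
rows = np.array(list(product((0, 1), repeat=5)))

rng = np.random.default_rng(42)      # assumed seed, for reproducibility
perm = rng.permutation(len(rows))

n_train = int(0.66 * len(rows))      # int(21.12) -> 21 rows
train, test = rows[perm[:n_train]], rows[perm[n_train:]]

assert len(train) == 21 and len(test) == 11
```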

Conclusion
Our findings suggest that the proposed inexact method can be applied to any arithmetic combinational circuit, both at the architectural level and through the neural network training approach. The proposed inexact 4:2 compressor model has the smallest absolute difference between train and test accuracies, which shows that the NN model is completely validated. The proposed compressor also consumes less power and area while having the lowest error rate.