Multi-Valued Quantization Neural Networks toward Hardware Implementation

This paper proposes a Multi-Valued Quantization (MVQ) method for connection weights toward efficient hardware implementation of Convolutional Neural Networks (CNNs). The proposed method multiplies an input value by a multi-valued quantized weight during the forward and backward propagations, while retaining the precision of the stored weights for the update process. In both propagation processes, multipliers can be replaced with adders and shifters by choosing appropriate quantized weights. We train two- to six-valued quantization CNNs on the MNIST and CIFAR-10 datasets to compare their performance with a 32-bit floating-point CNN. In the four-valued quantization, random noise is added to the quantized weights to improve generalization ability. In addition, the robustness of the MVQ CNN to noise is evaluated. Experimental results show that the MVQ CNNs achieve better learning accuracy than the floating-point CNN and that the four-valued CNN is highly robust to noise.


Introduction
Convolutional Neural Network (CNN) 1 is a kind of neural network that has brought significant improvement to image recognition and computer vision. Thus, the implementation of CNNs in embedded systems raises high expectations for real-world applications.
Real-time processing and low power consumption are significant criteria for embedded systems. However, it is difficult to achieve them with a software-only implementation. In contrast, a special-purpose hardware architecture for CNNs enables parallel and pipelined processing, which improves processing speed and reduces power consumption. However, the repeated matrix computations, i.e., convolutions of input vectors and weights, need many multipliers that consume huge hardware resources. Therefore, a simplification method is desired because of the limitation of hardware resources. One method that simplifies the processing is quantization. 2,3 BinaryConnect is an approach that simplifies neural networks by binarizing the weights. 2 Through quantization, hardware resources can be reduced by replacing multipliers with adders and subtractors. However, the recognition accuracy may drop due to quantization. Therefore, verifying the relationship between the quantization bit rate and the accuracy of the CNN is necessary in order to ensure the performance of quantization.
In this paper, we propose a Multi-Valued Quantization (MVQ) of weights for efficient hardware implementation of CNNs. The recognition accuracy is compared between the proposed method with five different settings (two- to six-valued quantized weights) and a conventional CNN with 32-bit floating point, using the MNIST and CIFAR-10 datasets. In addition, robustness against noise is desired in hardware implementation. Therefore, noise is applied to MVQ-4, which shows the highest recognition accuracy among the MVQs, and the change in the results against the noise is observed.

Related research
The operations of neural networks can be divided into forward propagation, backward propagation, and the weight-update process. BinaryConnect binarizes the weight values to (+1, -1) for the forward and backward propagations, while the continuous values are stored. During the update process, the update parameters are calculated with the stored continuous values. Through the binarization of the weights, the multipliers are simplified to adders and subtractors, which is effective in reducing the circuit size. Especially during propagation, the multipliers are completely removed. Moreover, the generalization performance of the binarized CNN is improved and exceeds the recognition accuracy of regular CNNs.
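The BinaryConnect scheme described above can be illustrated with a minimal single-neuron sketch (hypothetical function names; the actual implementation is in the cited work 2): binarized weights drive the propagation, so each multiply-accumulate degenerates to an addition or subtraction, while full-precision weights receive the update.

```python
# Minimal sketch of the BinaryConnect idea (illustrative names, not the
# original code): binarized weights are used for propagation, and the
# stored full-precision weights are used for the update step.

def binarize(w):
    """Deterministic binarization: sign of the stored real-valued weight."""
    return 1.0 if w >= 0 else -1.0

def forward(inputs, real_weights):
    # Propagation uses only the binarized weights, so each
    # multiply-accumulate reduces to adding or subtracting the input.
    total = 0.0
    for x, w in zip(inputs, real_weights):
        total += x if binarize(w) > 0 else -x
    return total

def update(real_weights, grads, lr=0.1):
    # The update is applied to the stored full-precision weights.
    return [w - lr * g for w, g in zip(real_weights, grads)]

acc = forward([0.5, -0.2, 1.0], [0.3, -0.7, 0.1])  # 0.5 + 0.2 + 1.0 = 1.7
```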

Proposed method
In this paper, we propose the MVQ method, which extends binarization to multiple values.

MVQ
MVQ is a method that quantizes the weights into finer divisions than BinaryConnect. When the number of quantization levels is larger, the quantized weights are closer to the real-valued weights, but the merit of simplified calculation is lost. It is important to balance the number of quantization levels against the computational cost. In this paper, the number of quantized values is varied from three to six, and the recognition accuracy is observed.
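The hardware merit mentioned in the abstract, replacing multipliers with adders and shifters, relies on choosing quantization levels of the form ±2^-k. As a rough sketch under that assumption (the paper's actual level sets are defined by its quantization equation), a multiply by such a weight on an integer input becomes a bit shift plus an optional sign change:

```python
# Sketch of why power-of-two quantized weights suit hardware (the level
# values here are assumptions for illustration): multiplying an integer
# input by +/-2^-k needs only an arithmetic shift and a sign change.

def mul_by_quantized(x_int, level_exp, negative):
    """Multiply integer x_int by +/-2**(-level_exp) using only a shift."""
    y = x_int >> level_exp  # the shift replaces the multiplier
    return -y if negative else y

# weight = +0.5 -> shift right by 1; weight = -1 -> no shift, negate
print(mul_by_quantized(8, 1, False))  # 8 * 0.5 = 4
print(mul_by_quantized(8, 0, True))   # 8 * -1 = -8
```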

Quantization method
The quantization algorithm is explained in detail for the case of MVQ-4. The quantization can be done by either a stochastic or a deterministic approach. 2 In this paper, the deterministic approach is employed. The original weights W are clipped at the clipping point w_range to quantize the weights into new quantized values before the forward and backward propagations. The multi-valued weight is formulated as in the equation below.
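Since Eq. (1) is not reproduced here, the following is only a plausible sketch of a deterministic four-valued quantizer consistent with the description above; the levels {±1, ±0.5} and the thresholds at ±w_range/2 are assumptions, not the paper's exact definition:

```python
# Hedged sketch of a deterministic four-valued quantization. The stored
# weight is first clipped to [-w_range, +w_range], then mapped to one of
# four shift-friendly levels. Levels and thresholds are assumed here.

def quantize4(w, w_range=0.7):
    """Deterministic four-valued quantization (assumed levels/thresholds)."""
    w = max(-w_range, min(w_range, w))  # clip at the clipping point
    if w >= 0:
        return 1.0 if w >= w_range / 2 else 0.5
    return -1.0 if w <= -w_range / 2 else -0.5

print(quantize4(0.9))    # clipped to 0.7 -> 1.0
print(quantize4(0.2))    # -> 0.5
print(quantize4(-0.34))  # -> -0.5
print(quantize4(-0.5))   # -> -1.0
```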
The other MVQs follow a similar quantization manner to that shown in Eq. (1). In this paper, we employ stochastic gradient descent (SGD) with MVQ for training the CNNs, as shown in Algorithm 1.
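A rough single-neuron sketch of one SGD step with MVQ may help fix the idea: quantized weights drive both propagations, while the full-precision weights receive the update and are clipped back into [-w_range, +w_range]. The quantizer levels and the squared-error loss here are illustrative assumptions, not the paper's exact formulation:

```python
# Single-neuron sketch of SGD with MVQ (in the spirit of Algorithm 1):
# quantize -> forward with W_q -> backward with W_q -> update real W -> clip.
# The four-level quantizer and the (y - t)^2 loss are assumptions.

def quantize4(w, w_range=0.7):
    w = max(-w_range, min(w_range, w))
    if w >= 0:
        return 1.0 if w >= w_range / 2 else 0.5
    return -1.0 if w <= -w_range / 2 else -0.5

def sgd_step(real_w, xs, target, lr=0.01, w_range=0.7):
    wq = [quantize4(w, w_range) for w in real_w]           # quantize
    y = sum(x * w for x, w in zip(xs, wq))                 # forward with W_q
    grad_y = 2.0 * (y - target)                            # dC/dy for (y-t)^2
    grads = [grad_y * x for x in xs]                       # backward with W_q
    new_w = [w - lr * g for w, g in zip(real_w, grads)]    # update real W
    return [max(-w_range, min(w_range, w)) for w in new_w]  # clip
```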

Addition of noise
To observe the stability and robustness against noise, noise is also added during the experiments. The noise is generated using a uniform random number generator and inserted into the layers during forward propagation, backward propagation, and the update process, as shown below.

Forward propagation:
For k = 1 to L, compute a_k knowing a_{k-1}, W_q and b.

Backward propagation:
Initialize the output layer's activation gradient ∂C/∂a_L. For k = L to 2, compute ∂C/∂a_{k-1} knowing ∂C/∂a_k and W_q.
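The noise insertion described above can be sketched as follows; the noise range and the exact insertion points are assumptions for illustration, matching the stated use of a uniform random number generator:

```python
# Sketch of uniform-noise insertion into the propagated error terms
# during backward propagation (range and insertion point assumed).

import random

def backward_error_with_noise(errors, noise_range=1.0, rng=None):
    """Add uniform noise in [-noise_range, +noise_range] to each error."""
    rng = rng or random.Random(0)  # fixed seed here for reproducibility
    return [e + rng.uniform(-noise_range, noise_range) for e in errors]
```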

Experimental results
Experiments are carried out by modifying Tiny_CNN (now known as Tiny_DNN), 4 a header-only CNN library. This library uses 32-bit floating point by default. Here, w_range is set to 0.7 for all MVQs, and Algorithm 1 is applied for training.

MNIST dataset
The MNIST dataset contains grayscale images of size 28x28 pixels in 10 classes of handwritten digits from zero to nine. 5 It includes 60,000 training images and 10,000 test images. The architecture of the CNN is based on LeNet-5 5 as shown in Fig. 1. The learning results are shown in Fig. 2. From the experimental results, MVQ-4 achieved the best result of all.

CIFAR-10 dataset
The CIFAR-10 dataset contains color images of size 32x32 pixels in 10 classes of objects such as animals and vehicles. It includes 50,000 training images and 10,000 test images. Figure 3 shows the CNN architecture for the CIFAR-10 dataset. Figure 4 shows the learning results.
Similar to the experimental results on the MNIST dataset, MVQ-4 also achieved the highest accuracy.

Addition of noise
Noise-addition experiments are carried out only with MVQ-4, since it achieved the best results on both MNIST and CIFAR-10. Figure 5 shows the results for every position of noise insertion. From the results, adding noise to the error during backward propagation gave the best result.
To measure the noise tolerance, we increased the noise range throughout the experiment and observed the effect. As a result, the proposed method is robust to noise up to a range of around 1.0 in backward propagation. The largest weight value in MVQ-4 is 0.999643; thus the proposed CNN seems to withstand noise equal to the maximum weight value.

Conclusion
This paper proposed the MVQ method and investigated the relationship between the number of quantization levels and the CNN recognition accuracy. From the experimental results, the proposed method showed higher accuracy than the conventional CNN with 32-bit floating-point operations, and among the different numbers of quantized values, MVQ-4 achieved the best accuracy. Furthermore, the experiment of noise addition to MVQ-4 was carried out to observe the relationship between the noise position and the recognition accuracy; noise insertion into the error during backward propagation showed the best result compared with insertion at the other positions. In addition, the robustness of the proposed method to noise was verified.
In future work, we will design a special purpose hardware architecture based on the proposed method for embedded systems.

Notation: a_k: dot product in the k-th layer; L: the number of layers; mini-batch size: 10; W, b: previous weight and bias parameters; η: learning rate; C: cost function.

Figure 1: Construction of CNN for the MNIST dataset.
Figure 2: Results on the MNIST dataset.